Abstract

We study a distributed antenna system where L antenna terminals (ATs) are connected to a Central
Processor (CP) via digital error-free links of finite capacity $R_0$, and serve K user terminals (UTs). This
model has been widely investigated both for the uplink (UTs to CP) and for the downlink (CP to UTs), 
which are instances of the general multiple-access relay and broadcast relay networks. We contribute 
to the subject in the following ways: 1) for the uplink, we apply the "Compute and Forward" (CoF) 
approach and examine the corresponding system optimization at finite SNR; 2) For the downlink, we 
propose a novel precoding scheme nicknamed "Reverse Compute and Forward" (RCoF); 3) In both 
cases, we present low-complexity versions of CoF and RCoF based on standard scalar quantization at 
the receivers, that lead to discrete-input discrete-output symmetric memoryless channel models for which 
near-optimal performance can be achieved by standard single-user linear coding; 4) For the case of large 
$R_0$, we propose a novel "Integer Forcing Beamforming" (IFB) scheme that generalizes the popular
zero-forcing beamforming and achieves sum rate performance close to the optimal Gaussian Dirty-Paper 
Coding. 

The proposed uplink and downlink system optimization focuses specifically on the ATs and UTs 
selection problem. In both cases, for a given set of transmitters, the goal consists of selecting a subset of 
the receivers such that the corresponding system matrix has full rank and the sum rate is maximized. We 
present low-complexity ATs and UTs selection schemes and demonstrate, through Monte Carlo simulation 
in a realistic environment with fading and shadowing, that the proposed schemes essentially eliminate the 
problem of rank deficiency of the system matrix and greatly mitigate the non-integer penalty affecting 
CoF/RCoF at high SNR. Comparison with other state-of-the-art information theoretic schemes, such as
"Quantize reMap and Forward" for the uplink and "Compressed Dirty Paper Coding" for the downlink,
shows competitive performance of the proposed approaches with significantly lower complexity.

Index Terms 

Compute and Forward, Reverse Compute and Forward, Lattice Codes, Distributed Antenna Systems, 
Multicell Cooperation. 
Fig. 1. Distributed Antenna System with 5 UTs and 4 ATs (e.g., K = 5 and L = 4), and digital backhaul links of rate $R_0$.

I. Introduction 

A cloud base station is a Distributed Antenna System (DAS) formed by a number of simple antenna
terminals (ATs) [1], spatially distributed over a certain area, and connected to a central processor (CP) via
wired backhaul [2]-[4]. Cloud base station architectures differ by the type of processing made at the ATs
and at the CP, and by the type of wired backhaul. At one extreme of this range of possibilities, the ATs
perform just analog filtering and (possibly) frequency conversion, the wired links are analog (e.g., radio
over fiber [5]), and the CP performs demodulation to baseband, A/D and D/A conversion, joint decoding
(uplink) and joint pre-coding (downlink). At the other extreme we have "small cell" architectures where 
the ATs perform encoding/decoding, the wired links send data packets, and the CP performs high-level 
functions, such as scheduling, link-layer error control, and macro-diversity packet selection. 

In this paper we focus on an intermediate DAS architecture where the ATs perform partial decoding 
(uplink) or precoding (downlink) and the backhaul is formed by digital links of fixed rate $R_0$. In this
case, the DAS uplink is an instance of a multi-source single destination layered relay network where the
first layer is formed by the user terminals (UTs), the second layer is formed by the ATs and the third
layer contains just the CP (see Fig. 1). The corresponding DAS downlink is an instance of a broadcast
layered relay network with independent messages. 
In our model, analog forwarding from ATs to CP (uplink) or from CP to ATs (downlink) is not possible.
Hence, some form of quantization and forwarding is needed. A general approach to the uplink is based
on the Quantize reMap and Forward (QMF) paradigm of [6] (extended in [7], where it is referred to
as Noisy Network Coding). In this case, the ATs perform vector quantization of their received signal at
some rate $R' \geq R_0$. They map the blocks of $nR'$ quantization bits into binary words of length $nR_0$
by using some randomized hashing function (notice that this corresponds to binning if $R' > R_0$), and
let the CP perform joint decoding of all UTs' messages based on the observation of all the (hashed)
quantization bits.¹ It is known [6] that QMF achieves a rate region within a bounded gap from the cut-set
outer bound [9], where the bound depends only on the network size and on $R_0$, but it is independent of
the channel coefficients and of the operating SNR. For the broadcast-relay downlink, a general coding



strategy has been proposed in [10] based on a combination of Marton coding for the general broadcast
channel [11] and a coding scheme for deterministic linear relay networks, "lifted" to the Gaussian case.
Specializing the above general coding schemes to the DAS considered here, for the uplink we obtain
the scheme based on quantization, binning and joint decoding of [12], and for the downlink we obtain
the Compressed Dirty-Paper Coding (CDPC) scheme of [13]. From an implementation viewpoint, both
QMF and CDPC are not practical, the former requiring vector quantization at the ATs and joint decoding
of all UT messages based on the hashed quantization bits at the CP, and the latter requiring Dirty-Paper
Coding (notoriously difficult to implement in practice) and vector quantization at the CP.



A lower complexity alternative strategy for general relay networks was proposed in [14] and goes
under the name of Compute and Forward (CoF). CoF makes use of lattice codes, such that each relay
can reliably decode a linear combination with integer coefficients of the interfering codewords. Thanks
to the fact that lattices are modules over the ring of integers, this linear combination translates directly
into a linear combination of the information messages defined over a suitable finite field. CoF can be
immediately used for the DAS uplink. The performance of CoF was examined in [15] for the DAS
uplink in the case of the overly simplistic Wyner model [16]. It was shown that CoF yields competitive
performance with respect to QMF for practically realistic values of SNR.

This paper contributes to the subject in the following ways: 1) for the DAS uplink, we consider the
CoF approach and examine the corresponding system optimization at finite SNR for a general channel
model including fading and shadowing (i.e., beyond the nice and regular structure of the Wyner model);
2) for the downlink, we propose a novel precoding scheme nicknamed Reverse Compute and Forward
(RCoF); 3) for both uplink and downlink, we present low-complexity versions of CoF and RCoF based
on standard scalar quantization at the receivers. These schemes are motivated by the observation that
the main bottleneck of a digital receiver is the Analog to Digital Conversion (ADC), which is costly,
power-hungry, and does not scale with Moore's law. Rather, the number of bits per second produced by
an ADC is roughly a constant that depends on the power consumption [17], [18]. Therefore, it makes
sense to consider the ADC as part of the channel. The proposed schemes, nicknamed Quantized CoF
(QCoF) and Quantized RCoF (RQCoF), lead to discrete-input discrete-output symmetric memoryless
channel models naturally matched to standard single-user linear coding. In fact, QCoF and RQCoF can
be easily implemented using q-ary Low-Density Parity-Check (LDPC) codes [19]-[21] with $q = p^2$ and
p prime, yielding essentially linear complexity in the code block length and polynomial complexity in
the system size (minimum between number of ATs and UTs).

¹ The information-theoretic vector quantization of [6], [7] can be replaced by scalar quantization with a fixed-gap performance
degradation [8].

The two major impairments that deteriorate the performance of DAS with CoF/RCoF are the
non-integer penalty (i.e., the residual self-interference due to the fact that the channel coefficients take on
non-integer values in practice) and the rank-deficiency of the resulting system matrix over the q-ary finite
field. In fact, the wireless channel is characterized by fading and shadowing. Hence, the channel matrix
from ATs to UTs does not have any particularly nice structure, in contrast to the Wyner model case,
where the channel matrix is tri-diagonal [15]. Thus, in a realistic setting, the system matrix resulting
from CoF/RCoF may be rank deficient. This is especially relevant when the size q of the finite field is 
small (e.g., it is constrained by the resolution of the A/D and D/A conversion). The proposed system 
optimization counters the above two problems by considering power allocation, network decomposition 
and antenna selection at the receivers (ATs selection in the uplink and UTs selection in the downlink). 
We show that in most practical cases the AT and UT selection problems can be optimally solved by a 
simple greedy algorithm. Numerical results show that, in realistic networks with fading and shadowing, 
the proposed optimization algorithms are very effective and essentially eliminate the problem of system 
matrix rank deficiency, even for small field size q. 

A final novel contribution of this paper consists of the Integer-Forcing Beamforming (IFB) downlink 
scheme, targeted to the case where $R_0$ is large, and therefore the DAS downlink reduces to the well-known
vector Gaussian broadcast channel. In this case, a common and well-known low-complexity alternative 
to the capacity-achieving Gaussian DPC scheme consists of Zero-Forcing Beamforming (ZFB), which 
achieves the same optimal multiplexing gain, at the cost of some performance loss at finite SNR. IFB can 
be regarded both as a generalization of ZFB and as the dual of Integer-Forcing Receiver (IFR), proposed 
in [22] for the uplink multiuser MIMO case. We demonstrate that IFB can achieve rates close to the
information-theoretic optimal Gaussian DPC, and can significantly outperform conventional ZFB. This 
gain can be explained by the fact that IFB is able to reduce the power penalty of ZFB, due to non-unitary 
beamforming. 

The paper is organized as follows. In Section II we define the uplink and downlink DAS system model,
summarize some definitions on lattices and lattice coding, and review CoF. In Section III we consider
the application of CoF to the DAS uplink and introduce the (novel) concept of network decomposition to
improve the CoF sum rate. Section IV considers the DAS downlink and presents the RCoF scheme. In
Section V we introduce the low-complexity "quantized" versions of CoF and RCoF. Section VI focuses
on the symmetric Wyner model and presents a simple power allocation strategy to alleviate the impact
of non-integer penalty. In the case of a realistic DAS channel model including fading, shadowing and
pathloss, a low-complexity greedy algorithm for ATs selection (uplink) and UTs selection (downlink) is
presented in Section VII. Finally, Section VIII considers the case of large backhaul rate and presents the
IFB scheme. Some concluding remarks are provided in Section IX.



II. Preliminaries 

In this section we provide some basic definitions and results that will be extensively used in the sequel. 



A. Distributed Antenna Systems: Channel Model 

We consider a DAS with L ATs and K UTs, each of which is equipped with a single antenna. The 
ATs are connected to the CP via digital backhaul links of rate $R_0$ (see Fig. 1). A block of n channel
uses of the discrete-time complex baseband uplink channel is described by 

$$\underline{Y} = H \underline{X} + \underline{Z}, \qquad (1)$$

where we use "underline" to denote matrices whose horizontal dimension (column index) denotes "time"
and vertical dimension (row index) runs across the antennas (UTs or ATs), the matrices

$$\underline{X} = \begin{bmatrix} x_1 \\ \vdots \\ x_K \end{bmatrix} \quad \text{and} \quad \underline{Y} = \begin{bmatrix} y_1 \\ \vdots \\ y_L \end{bmatrix}$$

contain, arranged by rows, the UT codewords $x_k \in \mathbb{C}^{1 \times n}$ and the AT channel output vectors $y_\ell \in \mathbb{C}^{1 \times n}$,
for $k = 1, \ldots, K$, and $\ell = 1, \ldots, L$, respectively. The matrix $\underline{Z}$ contains i.i.d. Gaussian noise samples
$\sim \mathcal{CN}(0, 1)$, and the matrix $H = [h_1, \ldots, h_L]^T \in \mathbb{C}^{L \times K}$ contains the channel coefficients, assumed to
be constant over the whole block of length n and known to all nodes.

Similarly, a block of n channel uses of the discrete-time complex baseband downlink channel is
described by $\tilde{\underline{Y}} = \tilde{H} \tilde{\underline{X}} + \tilde{\underline{Z}}$, where we use "tilde" to denote downlink variables, $\tilde{\underline{X}} \in \mathbb{C}^{L \times n}$ contains the
AT codewords, $\tilde{\underline{Y}}, \tilde{\underline{Z}} \in \mathbb{C}^{K \times n}$ contain the channel output and Gaussian noise at the UT receivers, and
$\tilde{H} = [\tilde{h}_1, \ldots, \tilde{h}_K]^T \in \mathbb{C}^{K \times L}$ is the downlink channel matrix.

Since ATs and UTs are separated in space and powered independently, we assume a symmetric
per-antenna power constraint for both the uplink and the downlink, given by $\frac{1}{n} E[\|x_k\|^2] \leq \mathrm{SNR}$ for all $k$
and by $\frac{1}{n} E[\|\tilde{x}_\ell\|^2] \leq \mathrm{SNR}$ for all $\ell$, respectively.



B. Nested Lattice Codes 

Let $\mathbb{Z}[j]$ be the ring of Gaussian integers and $p$ be a Gaussian prime.² Let $\oplus$ denote the addition
over $\mathbb{F}_{p^2}$, and let $g : \mathbb{F}_{p^2} \to \mathbb{C}$ be the natural mapping of $\mathbb{F}_{p^2}$ onto $\{a + jb : a, b \in \mathbb{Z}_p\} \subset \mathbb{C}$. We recall
the nested lattice code construction given in [14]. Let $\Lambda = \{\lambda = zT : z \in \mathbb{Z}^n[j]\}$ be a lattice in $\mathbb{C}^n$,
with full-rank generator matrix $T \in \mathbb{C}^{n \times n}$. Let $\mathcal{C} = \{c = wG : w \in \mathbb{F}_{p^2}^r\}$ denote a linear code over
$\mathbb{F}_{p^2}$ with block length $n$ and dimension $r$, with generator matrix $G$. The lattice $\Lambda_1$ is defined through
"construction A" (see [23] and references therein) as

$$\Lambda_1 = p^{-1} g(\mathcal{C}) T + \Lambda, \qquad (2)$$

where $g(\mathcal{C})$ is the image of $\mathcal{C}$ under the mapping $g$ (applied component-wise). It follows that $\Lambda \subseteq \Lambda_1 \subseteq p^{-1}\Lambda$
is a chain of nested lattices, such that $|\Lambda_1 / \Lambda| = p^{2r}$ and $|p^{-1}\Lambda / \Lambda_1| = p^{2(n-r)}$.

For a lattice $\Lambda$ and $r \in \mathbb{C}^n$, we define the lattice quantizer $Q_\Lambda(r) = \arg\min_{\lambda \in \Lambda} \|r - \lambda\|^2$, the Voronoi
region $\mathcal{V}_\Lambda = \{r \in \mathbb{C}^n : Q_\Lambda(r) = 0\}$ and $[r] \bmod \Lambda = r - Q_\Lambda(r)$. For $\Lambda$ and $\Lambda_1$ given above, we define
the lattice code $\mathcal{L} = \Lambda_1 \cap \mathcal{V}_\Lambda$ with rate $R = \frac{1}{n} \log|\mathcal{L}| = \frac{2r}{n} \log p$. Construction A provides a natural
labeling of the codewords of $\mathcal{L}$ by the information messages $w \in \mathbb{F}_{p^2}^r$. Notice that the set $p^{-1} g(\mathcal{C}) T$ is a
system of coset representatives of the cosets of $\Lambda$ in $\Lambda_1$. Hence, the natural labeling function $f : \mathbb{F}_{p^2}^r \to \mathcal{L}$
is defined by $f(w) = p^{-1} g(wG) T \bmod \Lambda$.
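To make the finite-field side of the construction concrete, the following minimal sketch realizes $\mathbb{F}_{p^2}$ as Gaussian integers reduced modulo $p\mathbb{Z}[j]$, which is a field when $p$ is a prime congruent to 3 modulo 4. The function names, the pair representation `(a, b)` for $a + jb$, and the example prime `P = 7` are illustrative assumptions, not notation from the paper. The modular inverse relies on the three-argument `pow` with a negative exponent (Python 3.8 or later).

```python
# Sketch: arithmetic in F_{p^2} realized as Z[j] / pZ[j] for a prime p = 3 (mod 4).
# Elements are pairs (a, b) standing for a + jb with a, b in Z_p.
# All names below are illustrative choices, not taken from the paper.

P = 7  # assumed example prime, 7 = 3 (mod 4)


def g(elem):
    """Natural mapping g: F_{p^2} -> C, sending (a, b) to a + jb."""
    a, b = elem
    return complex(a, b)


def g_inv(z, p=P):
    """Reduce a Gaussian integer modulo pZ[j] and return its field label."""
    return (round(z.real) % p, round(z.imag) % p)


def add(x, y, p=P):
    """Field addition (the operation written as a circled plus in the text)."""
    return ((x[0] + y[0]) % p, (x[1] + y[1]) % p)


def mul(x, y, p=P):
    """Field product: (a + jb)(c + jd) = (ac - bd) + j(ad + bc), reduced mod p."""
    a, b = x
    c, d = y
    return ((a * c - b * d) % p, (a * d + b * c) % p)


def inv(x, p=P):
    """Inverse (a - jb) / (a^2 + b^2); the norm a^2 + b^2 is nonzero mod p
    for every nonzero element because -1 is not a square when p = 3 (mod 4)."""
    a, b = x
    norm = (a * a + b * b) % p
    if norm == 0:
        raise ZeroDivisionError("the zero element has no inverse")
    norm_inv = pow(norm, -1, p)
    return ((a * norm_inv) % p, (-b * norm_inv) % p)
```

For instance, `g_inv(complex(9, -3))` reduces the Gaussian integer $9 - 3j$ modulo $7\mathbb{Z}[j]$ and returns the label `(2, 4)`.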



² The prime elements of $\mathbb{Z}[j]$ are known as Gaussian primes. In this paper, $p$ is assumed to be a prime number congruent to
3 modulo 4.
C. Compute and Forward

We recall here the CoF scheme of [14]. Consider the K-user Gaussian multiple access channel
(G-MAC) defined by

$$y = \sum_{k=1}^{K} h_k x_k + z, \qquad (3)$$

where $h = [h_1, \ldots, h_K]^T$, and the elements of $z$ are i.i.d. $\sim \mathcal{CN}(0, 1)$. All users make use of the same
nested lattice codebook $\mathcal{L} = \Lambda_1 \cap \mathcal{V}_\Lambda$, where $\Lambda$ has second moment $\sigma_\Lambda^2 = \frac{1}{n \mathrm{Vol}(\mathcal{V}_\Lambda)} \int_{\mathcal{V}_\Lambda} \|r\|^2 \, dr = \mathrm{SNR}$.
Each user $k$ encodes its information message $w_k \in \mathbb{F}_{p^2}^r$ into the corresponding codeword $t_k = f(w_k)$
and produces its channel input according to

$$x_k = [t_k + d_k] \bmod \Lambda, \qquad (4)$$

where the dithering sequences $d_k$ are mutually independent across the users, uniformly distributed over
$\mathcal{V}_\Lambda$, and known to the receiver. The decoder's goal is to recover a linear combination $v = [\sum_{k=1}^{K} a_k t_k] \bmod \Lambda$
with integer coefficient vector $a = [a_1, \ldots, a_K]^T \in \mathbb{Z}^K[j]$. Since $\Lambda_1$ is a $\mathbb{Z}[j]$-module (closed
under linear combinations with Gaussian integer coefficients), then $v \in \mathcal{L}$. Letting $\hat{v}$ be the decoded codeword
(for some decoding function which in general depends on $h$ and $a$), we say that a computation rate $R$
is achievable for this setting if there exist sequences of lattice codes $\mathcal{L}$ of rate $R$ and increasing block
length $n$, such that the decoding error probability satisfies $\lim_{n \to \infty} P(\hat{v} \neq v) = 0$.



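In the scheme of [14], the receiver computes

$$\hat{y} = \Big[\alpha y - \sum_{k=1}^{K} a_k d_k\Big] \bmod \Lambda = [v + z_{\rm eff}(h, a, \alpha)] \bmod \Lambda, \qquad (5)$$

where

$$z_{\rm eff}(h, a, \alpha) = \sum_{k=1}^{K} (\alpha h_k - a_k) x_k + \alpha z \qquad (6)$$

denotes the effective noise, including the non-integer self-interference (due to the fact that $\alpha h_k \notin \mathbb{Z}[j]$ in
general) and the additive Gaussian noise term. The scaling, dither removal and modulo-$\Lambda$ operation in (5)
is referred to as the CoF receiver mapping in the following. By minimizing the variance of $z_{\rm eff}(h, a, \alpha)$
with respect to $\alpha$, we obtain

$$\sigma^2(h, a) = \min_{\alpha} \sigma^2_{\rm eff}(h, a, \alpha) = \mathrm{SNR} \left( \|a\|^2 - \frac{\mathrm{SNR} \, |h^H a|^2}{1 + \mathrm{SNR} \|h\|^2} \right) \overset{(a)}{=} a^H (\mathrm{SNR}^{-1} I + h h^H)^{-1} a, \qquad (7)$$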
where (a) follows from the matrix inversion lemma [24]. Since $\alpha$ is uniquely determined by $h$ and $a$, it
will be omitted in the following, for the sake of notation simplicity. From [14], we know that by applying
lattice decoding to $\hat{y}$ given in (5) the following computation rate is achievable:

$$R(h, a, \mathrm{SNR}) = \log^+ \left( \frac{\mathrm{SNR}}{a^H (\mathrm{SNR}^{-1} I + h h^H)^{-1} a} \right), \qquad (8)$$

where $\log^+(x) = \max\{\log(x), 0\}$.

The computation rate $R(h, a, \mathrm{SNR})$ can be maximized by minimizing $\sigma^2(h, a)$ with respect to $a$. The
quadratic form (7) is positive definite for any $\mathrm{SNR} < \infty$, since the matrix $(\mathrm{SNR}^{-1} I + h h^H)^{-1}$ has
eigenvalues

$$\lambda_i = \begin{cases} \mathrm{SNR} / (1 + \|h\|^2 \mathrm{SNR}) & i = 1 \\ \mathrm{SNR} & i > 1 \end{cases} \qquad (9)$$
By Cholesky decomposition, there exists a lower triangular matrix $L$ such that $\sigma^2(h, a) = \|L^H a\|^2$. It
follows that the problem of minimizing $\sigma^2(h, a)$ over $a \in \mathbb{Z}^K[j]$ is equivalent to finding the "shortest
lattice point" of the K-dimensional lattice generated by $L^H$. This can be efficiently obtained using the
complex LLL algorithm [25], [26], possibly followed by Pohst or Schnorr-Euchner enumeration (see [27])
of the non-zero lattice points in a sphere centered at the origin, with radius equal to the shortest vector
found by complex LLL. Algorithm 1 summarizes the procedures used in this paper to find the optimal
integer vector $a \in \mathbb{Z}^K[j]$.



Algorithm 1 Find the optimal integer coefficients

1. Take $F = L^H$

2. Find the reduced basis matrix $F_{\rm red}$, using the (complex) LLL algorithm

3. Take the column of $F_{\rm red}$ with minimum Euclidean norm, call it $b^\star$

4. Let $\rho = \|b^\star\| + \epsilon$ for some very small $\epsilon > 0$

5. Use Pohst or Schnorr-Euchner enumeration with $F_{\rm red}$ to find all lattice points in the sphere
centered at 0, with radius $\rho$.

Notice that this algorithm will find for sure the point 0 (discarded), the point $b^\star$, and possibly some
shorter non-zero points.
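For very small $K$, the outcome of Algorithm 1 can be cross-checked by exhaustive search. The sketch below is not the LLL-plus-enumeration procedure described above: it swaps in a brute-force scan over Gaussian integer vectors whose real and imaginary parts are bounded by a hypothetical parameter `bound`, and it is only practical for a handful of users. Function names are illustrative.

```python
import itertools


def sigma2(h, a, snr):
    """Minimum effective noise variance, closed form of (7)."""
    h_norm2 = sum(abs(x) ** 2 for x in h)
    a_norm2 = sum(abs(x) ** 2 for x in a)
    inner = sum(hk.conjugate() * ak for hk, ak in zip(h, a))  # h^H a
    return snr * (a_norm2 - snr * abs(inner) ** 2 / (1.0 + snr * h_norm2))


def best_integer_vector(h, snr, bound=2):
    """Brute-force stand-in for Algorithm 1: scan all non-zero a in Z^K[j]
    with |Re(a_k)|, |Im(a_k)| <= bound and keep the minimizer of sigma2."""
    values = [complex(re, im)
              for re in range(-bound, bound + 1)
              for im in range(-bound, bound + 1)]
    best, best_var = None, float("inf")
    for a in itertools.product(values, repeat=len(h)):
        if all(x == 0 for x in a):
            continue  # the all-zero vector is discarded, as in Algorithm 1
        var = sigma2(h, a, snr)
        if var < best_var - 1e-12:
            best, best_var = a, var
    return best, best_var
```

Note that the minimizer is unique only up to multiplication by a unit of $\mathbb{Z}[j]$ (that is, $\pm 1, \pm j$), since such a rotation leaves $\sigma^2(h, a)$ unchanged.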



III. Compute and Forward for the DAS Uplink 

In this section we apply CoF to the DAS uplink and further improve its sum rate by introducing the idea
of network decomposition. The scheme is illustrated in Fig. 2, where CoF is used at each AT receiver.

Fig. 2. DAS Uplink Architecture using Compute and Forward: L = 4 and K = 4.

For simplicity of exposition, we restrict to consider the same number K = L of UTs and ATs. The notation,
however, applies also to the case of K < L addressed in Section VII when considering AT selection. The
UTs make use of the same lattice code $\mathcal{L}$ of rate $R$, and produce their channel input $x_k$, $k = 1, \ldots, K$,
according to (4). Each AT $\ell$ decodes the codeword linear combination $v_\ell = [\sum_{k=1}^{K} a_{\ell,k} t_k] \bmod \Lambda$, for a
target integer vector $a_\ell = (a_{\ell,1}, \ldots, a_{\ell,K})^T \in \mathbb{Z}^K[j]$ determined according to Algorithm 1, independently
of the other ATs. If $R < R(h_\ell, a_\ell, \mathrm{SNR})$, where the latter denotes the computation rate of the G-MAC
formed by the UTs and the $\ell$-th AT, taking on the form given in (8), the decoding error probability at AT
$\ell$ can be made as small as desired. Letting $u_\ell = f^{-1}(v_\ell)$ denote the information message corresponding
to the target decoded codeword $v_\ell$, the code linearity over $\mathbb{F}_{p^2}$ and the $\mathbb{Z}[j]$-module structure of $\Lambda_1$ yield

$$u_\ell = \bigoplus_{k=1}^{K} q_{\ell,k} w_k, \qquad (10)$$

where $q_{\ell,k} = g^{-1}([a_{\ell,k}] \bmod p\mathbb{Z}[j])$. After decoding, each AT $\ell$ forwards the corresponding information
message $u_\ell$ to the CP via wired links of fixed rate $R_0$. This can be done if $R \leq R_0$. The CP collects all the
messages $u_\ell$ for $\ell = 1, \ldots, L$ and forms the system of linear equations over $\mathbb{F}_{p^2}$









$$\begin{bmatrix} u_1 \\ \vdots \\ u_L \end{bmatrix} = Q \begin{bmatrix} w_1 \\ \vdots \\ w_K \end{bmatrix}, \qquad (11)$$

where we define $A = [a_1, \ldots, a_L]^T$ and the system matrix $Q = [q_1, \ldots, q_L]^T = g^{-1}([A] \bmod p\mathbb{Z}[j])$.
Provided that $Q$ has rank $K$ over $\mathbb{F}_{p^2}$, the CP obtains the decoded messages $\{\hat{w}_k\}$ by Gaussian
elimination. Assuming this full-rank condition and $R < R(h_\ell, a_\ell, \mathrm{SNR})$ for all $\ell = 1, \ldots, L$, the error probability
$P(\hat{w}_k \neq w_k \text{ for some } k)$ can be made arbitrarily small for sufficiently large $n$. The resulting achievable
rate per user is given by [15]:

$$R = \min\{R_0, \min_{\ell}\{R(h_\ell, a_\ell, \mathrm{SNR})\}\}. \qquad (12)$$
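The full-rank condition on $Q$ can be tested with ordinary Gaussian elimination carried out over $\mathbb{F}_{p^2}$. The following sketch assumes field elements are stored as pairs `(a, b)` for $a + jb$ with $p \equiv 3 \pmod 4$; the helper names `system_matrix` and `rank_fp2` are hypothetical. It illustrates that a matrix of integer coefficients with full rank over $\mathbb{C}$ may lose rank after reduction modulo $p\mathbb{Z}[j]$, especially for small $p$.

```python
def f_mul(x, y, p):
    """Product in F_{p^2} = Z[j] / pZ[j]."""
    return ((x[0] * y[0] - x[1] * y[1]) % p, (x[0] * y[1] + x[1] * y[0]) % p)


def f_sub(x, y, p):
    """Difference in F_{p^2}."""
    return ((x[0] - y[0]) % p, (x[1] - y[1]) % p)


def f_inv(x, p):
    """Inverse in F_{p^2}; valid for nonzero x when p = 3 (mod 4)."""
    n_inv = pow((x[0] * x[0] + x[1] * x[1]) % p, -1, p)
    return ((x[0] * n_inv) % p, (-x[1] * n_inv) % p)


def system_matrix(A, p):
    """Q = g^{-1}([A] mod pZ[j]) for a matrix A of complex Gaussian integers."""
    return [[(round(z.real) % p, round(z.imag) % p) for z in row] for row in A]


def rank_fp2(Q, p):
    """Rank of Q over F_{p^2} by Gauss-Jordan elimination."""
    M = [list(row) for row in Q]
    rows, cols = len(M), len(M[0])
    rank = 0
    for c in range(cols):
        pivot = next((r for r in range(rank, rows) if M[r][c] != (0, 0)), None)
        if pivot is None:
            continue  # no pivot in this column
        M[rank], M[pivot] = M[pivot], M[rank]
        scale = f_inv(M[rank][c], p)
        M[rank] = [f_mul(scale, x, p) for x in M[rank]]
        for r in range(rows):
            if r != rank and M[r][c] != (0, 0):
                factor = M[r][c]
                M[r] = [f_sub(x, f_mul(factor, y, p), p) for x, y in zip(M[r], M[rank])]
        rank += 1
        if rank == rows:
            break
    return rank
```

For example, with $p = 3$ the coefficient rows $(1, 1)$ and $(1, 1 + 3j)$ are linearly independent over $\mathbb{C}$ but coincide modulo $3\mathbb{Z}[j]$, so the resulting system matrix has rank 1.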

Remark 1: Since each AT $\ell$ determines its coefficients vector $a_\ell$ in a decentralized way, by applying
Algorithm 1 independently of the other ATs' channel coefficients, the resulting system matrix $Q$ may
be rank-deficient. If K < L, requiring that all ATs can decode reliably is unnecessarily restrictive: it is
sufficient to select a subset of K ATs which can decode reliably and whose coefficients form a full-rank
system matrix. This selection problem will be addressed in Section VII.







The sum rate of CoF-based DAS can be improved by network decomposition with respect to the system
matrix $Q$. Although the elements of $H$ are non-zero, the corresponding $Q$ may include zeros, since some
elements of the vectors $a_\ell$ may be zero modulo $p\mathbb{Z}[j]$. Because of the presence of zero elements, the
system matrix $Q$ may be put in block diagonal form by column and row permutations. If the permuted
system matrix has $S$ diagonal blocks, the corresponding network graph decomposes into $S$ independent
subnetworks and CoF can be applied separately to each subnetwork, so that taking the minimum of the
computation rates across different subnetworks is not needed. Hence, the sum rate is given by the sum (over
the subnetworks) of the sum rates of each network component. In turn, the common UT rate of each
indecomposable subnetwork takes on the form (12). For given $Q$, the disjoint subnetwork components
can be found efficiently using depth-first or breadth-first search [28]. This also reduces the
computational complexity of Gaussian elimination, which is performed independently for each subnetwork.
We assume that, up to a suitable permutation of rows and columns, $Q$ can be put in block diagonal form
with diagonal blocks $Q(\mathcal{A}_s, \mathcal{U}_s)$ for $s = 1, \ldots, S$, where we use the following notation: for a matrix $Q$
with rows index set $[1 : L]$ and column index set $[1 : K]$, $Q(\mathcal{A}, \mathcal{U})$ denotes the submatrix obtained by
selecting the rows in $\mathcal{A} \subseteq [1 : L]$ and the columns in $\mathcal{U} \subseteq [1 : K]$. The following results are immediate:
Lemma 1: If $Q$ is a full-rank $K \times K$ matrix, the diagonal blocks $Q(\mathcal{A}_s, \mathcal{U}_s)$ are full-rank square
matrices for every $s$. ■

Theorem 1: CoF with network decomposition, applied to a DAS uplink with channel matrix
$H = [h_1, \ldots, h_K]^T \in \mathbb{C}^{K \times K}$, achieves the sum rate

$$R_{\rm CoF}(H, A) = \sum_{s=1}^{S} |\mathcal{A}_s| \min\{R_0, \min\{R(h_k, a_k, \mathrm{SNR}) : k \in \mathcal{A}_s\}\}, \qquad (13)$$

where $A = [a_1, \ldots, a_K]^T$ is the matrix of CoF integer coefficients, and where the system matrix
$Q = g^{-1}([A] \bmod p\mathbb{Z}[j])$ has full rank $K$ over $\mathbb{F}_{p^2}$ and can be put in block diagonal form by rows and
columns permutations, with diagonal blocks $Q(\mathcal{A}_s, \mathcal{U}_s)$ for $s = 1, \ldots, S$. ■

Fig. 3. DAS Downlink Architecture Using Reverse Compute and Forward: L = 4 and K = 4.
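The decomposition step and the sum rate (13) can be sketched as follows. The code treats the non-zero pattern of $Q$ as a bipartite graph between rows (ATs) and columns (UTs) and extracts its connected components by breadth-first search. The function names, the boolean-pattern input and the per-AT list `rates` (standing for the computation rates $R(h_k, a_k, \mathrm{SNR})$) are illustrative assumptions.

```python
from collections import deque


def decompose(pattern):
    """Connected components of the bipartite graph defined by the non-zero
    pattern of Q. Returns a list of (row_indices, column_indices) pairs."""
    n_rows, n_cols = len(pattern), len(pattern[0])
    seen_rows, seen_cols = set(), set()
    blocks = []
    for start in range(n_rows):
        if start in seen_rows:
            continue
        rows, cols = {start}, set()
        seen_rows.add(start)
        queue = deque([("row", start)])
        while queue:
            kind, idx = queue.popleft()
            if kind == "row":
                for c in range(n_cols):
                    if pattern[idx][c] and c not in seen_cols:
                        seen_cols.add(c)
                        cols.add(c)
                        queue.append(("col", c))
            else:
                for r in range(n_rows):
                    if pattern[r][idx] and r not in seen_rows:
                        seen_rows.add(r)
                        rows.add(r)
                        queue.append(("row", r))
        blocks.append((sorted(rows), sorted(cols)))
    return blocks


def sum_rate(blocks, rates, r0):
    """Sum rate (13): each block contributes |A_s| times its bottleneck rate."""
    return sum(len(rows) * min(r0, min(rates[r] for r in rows)) for rows, _ in blocks)
```

With a $4 \times 4$ pattern that splits into two $2 \times 2$ blocks, per-AT rates `[3, 1, 2, 1.5]` and $R_0 = 2.5$, the decomposed sum rate is $2 \cdot 2 + 2 \cdot 1 = 6$, whereas a single common rate as in (12) would give only $4 \cdot 1 = 4$.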

IV. Reverse Compute and Forward for the DAS Downlink 

In this section we propose a novel downlink precoding scheme nicknamed "Reverse" CoF (RCoF).
Again, we restrict to the case K = L although the notation applies to the case of K > L, treated
in Section VII. In a DAS downlink, the role of the ATs and UTs can be reversed with respect to the
uplink. Each UT can reliably decode an integer linear combination of the lattice codewords sent by
the ATs. However, the UTs cannot share the decoded codewords as in the uplink, since they have no
backhaul links. Instead, the "interference" in the finite-field domain can be totally eliminated by
zero-forcing precoding (over the finite field) at the CP. RCoF has a distinctive advantage with respect to its
CoF counterpart viewed before: since each UT sees only its own lattice codeword plus the effective
noise, each message is rate-constrained by the computation rate of its own intended receiver, and not
by the minimum of all computation rates across all receivers, as in the uplink case. In order to achieve
different coding rates while preserving the lattice $\mathbb{Z}[j]$-module structure, we use a family of nested lattices
$\Lambda \subseteq \Lambda_L \subseteq \cdots \subseteq \Lambda_1$, obtained by a nested construction A as described in [14, Sect. IV.A]. In particular,
we let $\Lambda_\ell = p^{-1} g(\mathcal{C}_\ell) T + \Lambda$ with $\Lambda = \mathbb{Z}^n[j] T$ and with $\mathcal{C}_\ell$ denoting the linear code over $\mathbb{F}_{p^2}$ generated
by the first $r_\ell$ rows of a common generator matrix $G$, with $r_L \leq r_{L-1} \leq \cdots \leq r_1$. The corresponding
nested lattice codes are given by $\mathcal{L}_\ell = \Lambda_\ell \cap \mathcal{V}_\Lambda$, and have rate $R_\ell = \frac{2 r_\ell}{n} \log p$. We let $\tilde{A} = [\tilde{a}_1, \ldots, \tilde{a}_K]^T$,
where $\tilde{a}_k \in \mathbb{Z}^L[j]$ denotes the integer coefficients vector used at UT $k$ for the modulo-$\Lambda$ receiver mapping
(see (5)), and we let $\tilde{Q} = g^{-1}([\tilde{A}] \bmod p\mathbb{Z}[j])$ denote the downlink system matrix, assumed to have
rank $L$. Then, the RCoF scheme proceeds as follows (see Fig. 3):

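• The CP sends $L$ independent messages to $L$ UTs (if K > L, then a subset of $L$ UTs is selected, as
explained in Section VII). We let $k_\ell$ denote the UT destination of the $\ell$-th message, encoded by $\mathcal{L}_\ell$
at rate $R_\ell$.

• The CP forms the messages $\tilde{w}_\ell$ by appending $r_1 - r_\ell$ zeros to each $\ell$-th information message $w_\ell$
of $r_\ell$ symbols, so that all messages have the same length $r_1$.

• The CP produces the precoded messages

$$\begin{bmatrix} \mu_1 \\ \vdots \\ \mu_L \end{bmatrix} = \tilde{Q}^{-1} \begin{bmatrix} \tilde{w}_1 \\ \vdots \\ \tilde{w}_L \end{bmatrix} \qquad (14)$$

(notice: if K > L then $\tilde{Q}$ is replaced by the $L \times L$ submatrix $\tilde{Q}(\{k_1, \ldots, k_L\}, [1 : L])$).

• The CP forwards the precoded message $\mu_\ell$ to AT $\ell$ for all $\ell = 1, \ldots, L$, via the digital backhaul
link.

• AT $\ell$ locally produces the lattice codeword $\nu_\ell = f(\mu_\ell) \in \mathcal{L}_1$ (the densest lattice code) and transmits
the corresponding channel input $\tilde{x}_\ell$ according to (4). Because of linearity, the precoding and the
encoding over the finite field commute. Therefore, we can write
$[\nu_1^T, \ldots, \nu_L^T]^T = [B [t_1^T, \ldots, t_L^T]^T] \bmod \Lambda$, where $t_\ell = f(\tilde{w}_\ell)$ and $B = g(\tilde{Q}^{-1})$.

• Each UT $k_\ell$ applies the CoF receiver mapping as in (5), with integer coefficients vector $\tilde{a}_{k_\ell}$ and
scaling factor $\alpha_{k_\ell}$, yielding

$$\begin{aligned}
\hat{y}_{k_\ell} &= \left[ \tilde{a}_{k_\ell}^T \begin{bmatrix} \nu_1 \\ \vdots \\ \nu_L \end{bmatrix} + z_{\rm eff}(\tilde{h}_{k_\ell}, \tilde{a}_{k_\ell}, \alpha_{k_\ell}) \right] \bmod \Lambda \\
&\overset{(a)}{=} \left[ \left( [\tilde{a}_{k_\ell}^T B] \bmod p\mathbb{Z}[j] \right) \begin{bmatrix} t_1 \\ \vdots \\ t_L \end{bmatrix} + z_{\rm eff}(\tilde{h}_{k_\ell}, \tilde{a}_{k_\ell}, \alpha_{k_\ell}) \right] \bmod \Lambda \\
&\overset{(b)}{=} \left[ t_\ell + z_{\rm eff}(\tilde{h}_{k_\ell}, \tilde{a}_{k_\ell}, \alpha_{k_\ell}) \right] \bmod \Lambda, \qquad (15)
\end{aligned}$$

where (a) is due to the fact that $[p t] \bmod \Lambda = 0$ for any codeword $t \in \Lambda_\ell$, and (b) follows from
the following result: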

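Lemma 2: Let $\tilde{Q} = g^{-1}([\tilde{A}] \bmod p\mathbb{Z}[j])$. Assuming $\tilde{Q}$ invertible over $\mathbb{F}_{p^2}$, if $B = g(\tilde{Q}^{-1})$, then:

$$[\tilde{A} B] \bmod p\mathbb{Z}[j] = I. \qquad (16)$$

Proof: Using $[\tilde{A}] \bmod p\mathbb{Z}[j] = g(\tilde{Q})$, we have:

$$\begin{aligned}
[\tilde{A} B] \bmod p\mathbb{Z}[j] &= [([\tilde{A}] \bmod p\mathbb{Z}[j]) B] \bmod p\mathbb{Z}[j] && (17) \\
&= [g(\tilde{Q}) g(\tilde{Q}^{-1})] \bmod p\mathbb{Z}[j] && (18) \\
&= [g(\tilde{Q} \tilde{Q}^{-1})] \bmod p\mathbb{Z}[j] && (19) \\
&= I. && (20)
\end{aligned}$$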
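Lemma 2 can be verified numerically on a small example. The sketch below works with exact integer arithmetic, storing Gaussian integers and field elements as pairs `(re, im)`; it inverts $\tilde{Q}$ over $\mathbb{F}_{p^2}$ by Gauss-Jordan elimination and then checks that $[\tilde{A} B] \bmod p\mathbb{Z}[j]$ is the identity. All helper names are hypothetical, $p \equiv 3 \pmod 4$ is assumed, and `mat_inv_fp2` raises `StopIteration` if the matrix is singular.

```python
def gmul(x, y):
    """Product of two Gaussian integers given as (re, im) pairs."""
    return (x[0] * y[0] - x[1] * y[1], x[0] * y[1] + x[1] * y[0])


def gsub(x, y):
    """Difference of two Gaussian integers."""
    return (x[0] - y[0], x[1] - y[1])


def reduce_mod(x, p):
    """Reduction modulo pZ[j], i.e. the field label of a Gaussian integer."""
    return (x[0] % p, x[1] % p)


def f_inv(x, p):
    """Inverse in F_{p^2}; valid for nonzero x when p = 3 (mod 4)."""
    n_inv = pow((x[0] * x[0] + x[1] * x[1]) % p, -1, p)
    return ((x[0] * n_inv) % p, (-x[1] * n_inv) % p)


def mat_inv_fp2(Q, p):
    """Inverse of a square matrix over F_{p^2} by Gauss-Jordan elimination."""
    n = len(Q)
    M = [list(row) + [(1, 0) if i == j else (0, 0) for j in range(n)]
         for i, row in enumerate(Q)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if M[r][c] != (0, 0))
        M[c], M[pivot] = M[pivot], M[c]
        scale = f_inv(M[c][c], p)
        M[c] = [reduce_mod(gmul(scale, x), p) for x in M[c]]
        for r in range(n):
            if r != c and M[r][c] != (0, 0):
                factor = M[r][c]
                M[r] = [reduce_mod(gsub(x, gmul(factor, y)), p)
                        for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]


def matmul_mod(A, B, p):
    """[A B] mod pZ[j] for matrices of Gaussian integers."""
    out = []
    for row in A:
        new_row = []
        for j in range(len(B[0])):
            acc = (0, 0)
            for t, x in enumerate(row):
                prod = gmul(x, B[t][j])
                acc = (acc[0] + prod[0], acc[1] + prod[1])
            new_row.append(reduce_mod(acc, p))
        out.append(new_row)
    return out
```

For instance, with $p = 7$ and integer coefficient rows $(2 + j, 1)$ and $(9 - 3j, 1 + j)$, reducing the matrix, inverting it over $\mathbb{F}_{49}$ and multiplying back by the unreduced matrix returns the identity modulo $7\mathbb{Z}[j]$.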



From (15) we have that RCoF induces a point-to-point channel at each desired UT $k_\ell$, where the
integer-valued interference is eliminated by precoding, and the remaining effective noise is due to the
non-integer residual interference and to the channel Gaussian noise. The scaling coefficient $\alpha_{k_\ell}$ and the
integer vector $\tilde{a}_{k_\ell}$ are optimized independently by each UT using (7) and Algorithm 1. It follows that the
desired message $w_\ell$ can be recovered with arbitrarily small probability of error if $R_\ell < R(\tilde{h}_{k_\ell}, \tilde{a}_{k_\ell}, \mathrm{SNR})$,
where the latter takes on the form given in (8). Including the fact that the precoded messages can be sent
from the CP to the ATs if $R_1 \leq R_0$, we arrive at:
Theorem 2: RCoF applied to a DAS downlink with channel matrix $\tilde{H} = [\tilde{h}_1, \ldots, \tilde{h}_L]^T \in \mathbb{C}^{L \times L}$
achieves the sum rate

$$R_{\rm RCoF}(\tilde{H}, \tilde{A}) = \sum_{\ell=1}^{L} \min\{R_0, R(\tilde{h}_\ell, \tilde{a}_\ell, \mathrm{SNR})\}. \qquad ■$$

Remark 2: When the channel matrix $\tilde{H}$ has the property that each row $\ell$ is a permutation of the first
row (e.g., in the case $\tilde{H}$ is circulant, as in the Wyner model [16]), each UT has the same computation
rate and hence a single lattice code $\mathcal{L} = \mathcal{L}_1 = \cdots = \mathcal{L}_L$ is sufficient.

V. Low-Complexity Schemes 



This section considers low-complexity versions of the schemes of Sections III and IV using one- 
dimensional lattices and scalar quantization. Our approach is suited to the practically relevant case where 
the receivers are equipped with ADCs of fixed finite resolution, such that scalar quantization is included 
as an unavoidable part of the channel model. In this case, CoF and RCoF, as well as QMF and CDPC, are 
not possible since lattice quantization requires to have access to the unquantized (soft) signal samples. 

The quantized versions of CoF and RCoF follow as special cases, by choosing the generator matrix of
the shaping lattice $\Lambda$ to be $T = \tau I$, with $\tau = \sqrt{6\,\mathrm{SNR}}$ in order to satisfy the per-antenna power constraint
with equality. The resulting lattice code is $\mathcal{L} = \Lambda_1 \cap \mathcal{V}_\Lambda$ with $\Lambda = \tau \mathbb{Z}^n[j]$ and $\Lambda_1 = (\tau/p) g(\mathcal{C}) + \Lambda$,
for a linear code $\mathcal{C}$ over $\mathbb{F}_{p^2}$ of rate $R = \frac{2r}{n} \log p$. Furthermore, we introduce a scalar quantization stage
as part of each receiver. This is defined by the function $Q_{(\tau/p)\mathbb{Z}[j]}(\cdot)$, applied component-wise. Since $\Lambda$
is the n-dimensional complex cubic lattice, also the modulo-$\Lambda$ operations in CoF/RCoF are performed
component-wise. Hence, we can restrict to a symbol-by-symbol channel model instead of considering
n-vectors as before.



Consider the same G-MAC setting of Section II-C. Given the information message $w_k \in \mathbb{F}_{p^2}^r$, encoder
$k$ produces the codeword $c_k = w_k G$ and the corresponding lattice codeword $t_k = f(w_k) = (\tau/p) g(c_k) \bmod \Lambda$.
The $i$-th component of its channel input $x_k$ is given by

$$x_{k,i} = [t_{k,i} + d_{k,i}] \bmod \tau\mathbb{Z}[j], \qquad (21)$$

where the dithering samples $d_{k,i}$ are i.i.d. across users and time dimensions, and uniformly distributed
over the square region $[0, \tau) + j[0, \tau)$. The received signal is given by (3). The receiver selects the integer
coefficients vector $a = (a_1, \ldots, a_K)^T \in \mathbb{Z}^K[j]$ and produces the sequence $u \in \mathbb{F}_{p^2}^n$ with components

$$\begin{aligned}
u_i &= g^{-1}\left( \left[ \frac{p}{\tau} \left( \left[ Q_{(\tau/p)\mathbb{Z}[j]}\left( \alpha y_i - \sum_{k=1}^{K} a_k d_{k,i} \right) \right] \bmod \tau\mathbb{Z}[j] \right) \right] \bmod p\mathbb{Z}[j] \right) && (22) \\
&= g^{-1}\left( \left[ \frac{p}{\tau} \left( \left[ Q_{(\tau/p)\mathbb{Z}[j]}\left( \sum_{k=1}^{K} a_k t_{k,i} + \xi_i(h, a, \alpha) \right) \right] \bmod \tau\mathbb{Z}[j] \right) \right] \bmod p\mathbb{Z}[j] \right), && (23)
\end{aligned}$$

for $i = 1, \ldots, n$, where

$$\xi_i(h, a, \alpha) = \sum_{k=1}^{K} (\alpha h_k - a_k) x_{k,i} + \alpha z_i. \qquad (24)$$

Since $(p/\tau) t_{k,i} \in \mathbb{Z}[j]$ by construction, and using the obvious identity $Q_{\mathbb{Z}[j]}(v + \xi) = v + Q_{\mathbb{Z}[j]}(\xi)$ with
$v \in \mathbb{Z}[j]$ and $\xi \in \mathbb{C}$, we arrive at

$$u = \left( \bigoplus_{k=1}^{K} q_k c_k \right) \oplus \zeta(h, a, \alpha), \qquad (25)$$

where $q_k = g^{-1}([a_k] \bmod p\mathbb{Z}[j])$ and where the components of the discrete additive noise $\zeta(h, a, \alpha)$ are
given by $\zeta_i(h, a, \alpha) = g^{-1}([Q_{\mathbb{Z}[j]}((p/\tau) \xi_i(h, a, \alpha))] \bmod p\mathbb{Z}[j])$. This shows that the concatenation of
the lattice encoders, the G-MAC and the receiver mapping (22) reduces to an equivalent discrete linear
additive-noise finite-field MAC (FF-MAC) given by (25).

Fig. 4. Implementation of the modulo $\Lambda$ operation (analog component-wise sawtooth transformation) followed by the scalar
quantization function $Q_{(\tau/p)\mathbb{Z}[j]}(\cdot)$.
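The reduction to the FF-MAC can be illustrated with a small noise-free simulation. The sketch below implements the component-wise wrap, the scalar quantizer and the receiver mapping, and then checks that, in the idealized case where the channel equals the integer coefficients ($h = a$, $\alpha = 1$, no Gaussian noise, hence $\xi_i = 0$), the receiver output equals $\bigoplus_k q_k c_k$. The function names, the example values $p = 7$ and $\mathrm{SNR} = 10$, and the particular coefficient vector are assumptions made for illustration. The wrap uses Python's `round`, so the fundamental region differs from the Voronoi region only on its boundary, which does not affect the result modulo the lattice.

```python
import math
import random


def cround(z):
    """Quantizer Q_{Z[j]}: round real and imaginary parts separately."""
    return complex(round(z.real), round(z.imag))


def mod_lattice(z, tau):
    """[z] mod tau*Z[j] (component-wise sawtooth wrap)."""
    return z - tau * cround(z / tau)


def quantize(z, step):
    """Scalar quantizer Q_{step*Z[j]}(z)."""
    return step * cround(z / step)


def receiver_symbol(y, a, d, alpha, tau, p):
    """One output symbol of the receiver mapping: scale, remove the dither,
    quantize to (tau/p)Z[j], wrap modulo tau*Z[j], and label in F_{p^2}."""
    s = alpha * y - sum(ak * dk for ak, dk in zip(a, d))
    w = mod_lattice(quantize(s, tau / p), tau)
    v = cround(w * p / tau)
    return (int(v.real) % p, int(v.imag) % p)


def count_mismatches(n=200, p=7, snr=10.0, seed=0):
    """Noise-free check with h = a and alpha = 1 (idealized assumption)."""
    rng = random.Random(seed)
    tau = math.sqrt(6.0 * snr)
    a = [complex(2, 1), complex(1, -1)]  # example integer coefficients
    h = list(a)
    errors = 0
    for _ in range(n):
        c = [(rng.randrange(p), rng.randrange(p)) for _ in a]
        t = [mod_lattice((tau / p) * complex(cr, ci), tau) for cr, ci in c]
        d = [complex(rng.uniform(0, tau), rng.uniform(0, tau)) for _ in a]
        x = [mod_lattice(tk + dk, tau) for tk, dk in zip(t, d)]
        y = sum(hk * xk for hk, xk in zip(h, x))
        u = receiver_symbol(y, a, d, 1.0, tau, p)
        # Expected label: sum_k q_k c_k computed with exact integer arithmetic
        exp_re = sum(int(ak.real) * cr - int(ak.imag) * ci
                     for ak, (cr, ci) in zip(a, c)) % p
        exp_im = sum(int(ak.real) * ci + int(ak.imag) * cr
                     for ak, (cr, ci) in zip(a, c)) % p
        errors += (u != (exp_re, exp_im))
    return errors
```

With non-integer channel coefficients or additive noise, the same mapping produces the additional discrete noise term $\zeta$ of (25); this sketch deliberately leaves that term out.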

Remark 3: Notice that $\mathbf{u}$ is obtained from the channel output $\mathbf{y}$ by component-wise analog operations
(scaling by $\alpha$ and translation by $\sum_{k=1}^{K} a_k \mathbf{d}_k$), scalar quantization and modulo-$\Lambda$ reduction. In fact,
the scalar quantization and the modulo lattice operations commute, i.e., the modulo operation can be
performed directly on the analog signals by wrapping the complex plane into the Voronoi region of $\tau\mathbb{Z}[j]$,
and then the scalar quantizer $Q_{(\tau/p)\mathbb{Z}[j]}(\cdot)$ can be applied to the wrapped samples. This corresponds to
the analog sawtooth transformation, followed by scalar quantization, applied to the real and imaginary
parts of the complex baseband signal, as shown in Fig. 4.
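The commutation property stated in Remark 3 can be checked numerically. The following is a minimal Python sketch, not part of the original scheme description: the function names `wrap` and `quantize_index` are hypothetical, the wrapping uses the fundamental region $[0,\tau) + j[0,\tau)$ instead of the centered Voronoi region (an assumption made for simplicity), and the values $\tau = 7$ and $p = 7$ are arbitrary.

```python
import numpy as np

def wrap(x, tau):
    """Sawtooth map: wrap real and imaginary parts into [0, tau) (assumed region)."""
    return (x.real % tau) + 1j * (x.imag % tau)

def quantize_index(x, tau, p):
    """Round to the grid (tau/p) Z[j] and return the grid indices reduced modulo p."""
    step = tau / p
    re = np.floor(x.real / step + 0.5).astype(int) % p
    im = np.floor(x.imag / step + 0.5).astype(int) % p
    return re, im

rng = np.random.default_rng(0)
samples = 10.0 * (rng.standard_normal(1000) + 1j * rng.standard_normal(1000))

direct = quantize_index(samples, tau=7.0, p=7)               # quantize, then reduce
wrapped = quantize_index(wrap(samples, 7.0), tau=7.0, p=7)   # wrap first, then quantize
same = np.array_equal(direct[0], wrapped[0]) and np.array_equal(direct[1], wrapped[1])
```

In this sketch `same` indicates whether the two orders of operation give identical finite-field symbols. The equality relies on $\tau$ being an integer multiple of the quantization step $\tau/p$, which holds by construction.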
The marginal pmf of $\zeta_i(\mathbf{h},\mathbf{a},\alpha)$ can be calculated numerically, and it is well approximated by assuming
that $(p/\tau)\,\xi_i(\mathbf{h},\mathbf{a},\alpha)$ is zero-mean complex circularly symmetric Gaussian. In Appendix A we obtain an accurate and
easy way to calculate the pmf of the effective noise component $\zeta_i(\mathbf{h},\mathbf{a},\alpha)$ based on such Gaussian approximation.
The optimal choice of $\mathbf{a}$ and $\alpha$ for the discrete channel (25) consists of minimizing the entropy of the discrete
additive noise $H(\zeta_i(\mathbf{h},\mathbf{a},\alpha))$. However, this does not lead to a tractable numerical method. Instead, we resort to the
minimization of the unquantized effective noise variance $\sigma_\xi^2$, which leads to the same expression of $\alpha$ and
integer search of Algorithm 1. We assume that $\mathbf{a}$ and $\alpha$ are determined in this way, independently, by
each receiver, and omit $\alpha$ from the notation.
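To illustrate the Gaussian approximation, the sketch below computes the pmf of a quantized and wrapped Gaussian sample per real dimension and the resulting rate $2\log p - H(\zeta)$ in bits. It is a direct numerical summation and not necessarily the method of Appendix A. The function names are hypothetical, and the parameter `sigma2` (the variance per complex dimension of the scaled noise $(p/\tau)\xi_i$) is an input assumed to be known.

```python
import math

def gauss_cdf(x, s):
    """CDF of a zero-mean real Gaussian with standard deviation s."""
    return 0.5 * (1.0 + math.erf(x / (s * math.sqrt(2.0))))

def wrapped_pmf(p, s, span=50):
    """pmf over {0,...,p-1} of round(N(0, s^2)) mod p, summing over aliases."""
    pmf = []
    for k in range(p):
        mass = 0.0
        for m in range(-span, span + 1):
            c = k + m * p  # integer points that reduce to k modulo p
            mass += gauss_cdf(c + 0.5, s) - gauss_cdf(c - 0.5, s)
        pmf.append(mass)
    return pmf

def entropy_bits(pmf):
    return -sum(q * math.log2(q) for q in pmf if q > 0.0)

def quantized_rate(p, sigma2):
    """2 log2(p) - H(zeta), assuming i.i.d. real and imaginary noise parts."""
    s = math.sqrt(sigma2 / 2.0)  # standard deviation per real dimension
    return 2.0 * math.log2(p) - 2.0 * entropy_bits(wrapped_pmf(p, s))
```

For example, `quantized_rate(7, 0.01)` is close to $2\log_2 7$ bits, since the noise almost never leaves the central quantization cell, while a very large `sigma2` drives the rate to zero.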



In the following, we will present coding schemes for the induced FF-MAC in (25) and for the
corresponding Finite-Field Broadcast Channel (FF-BC) resulting from the downlink, by exchanging the
roles of ATs and UTs. We follow the notation used in Sections III and IV and let
$\mathbf{Q} = g^{-1}([\mathbf{A}] \bmod p\mathbb{Z}[j])$ denote the system matrix, built from the integer coefficient matrix $\mathbf{A}$ of the uplink or of
the downlink, respectively.

A. QCoF and LQF for the DAS Uplink 

In this section we present two schemes referred to as Quantized CoF (QCoF) and Lattice Quantize and
Forward (LQF), which differ by the processing at the ATs. QCoF is a low-complexity quantized version
of CoF. The quantized channel output at AT $\ell$ is given by

$$ \mathbf{u}_\ell = \mathbf{v}_\ell \oplus \boldsymbol{\zeta}(\mathbf{h}_\ell, \mathbf{a}_\ell), \tag{26} $$

where, by linearity, $\mathbf{v}_\ell = \bigoplus_{k=1}^{K} q_{\ell,k} \mathbf{c}_k$ is a codeword of $\mathcal{C}$. This is a point-to-point channel with discrete
additive noise over $\mathbb{F}_{p^2}$. AT $\ell$ can successfully decode $\mathbf{v}_\ell$ if $R < 2\log p - H(\zeta(\mathbf{h}_\ell, \mathbf{a}_\ell))$. This is an
immediate consequence of the well-known fact that linear codes achieve the capacity of symmetric
discrete memoryless channels [29]. If $R < R_0$, each AT $\ell$ can forward the decoded finite-field linear
combination of the messages to the CP, so that the original UT messages can be obtained by Gaussian elimination
(see Section III). With the same notation of Theorem 1, including network decomposition which applies
verbatim here, we have:
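Theorem 3: QCoF with network decomposition, applied to a DAS uplink with channel matrix
$\mathbf{H} = [\mathbf{h}_1, \ldots, \mathbf{h}_K]^{\sf T} \in \mathbb{C}^{K \times K}$, achieves the sum rate

$$ R_{\rm QCoF}(\mathbf{H}, \mathbf{A}) = \sum_{s=1}^{S} |\mathcal{A}_s| \min\big\{ R_0, \min\{ 2\log p - H(\zeta(\mathbf{h}_k, \mathbf{a}_k)) : k \in \mathcal{A}_s \} \big\}. \tag{27} $$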



Next, we consider the LQF scheme, which may provide an attractive alternative in the case $2\log p < R_0$,
i.e., when $R_0$ is large and a small value of $p$ is imposed by the ADC complexity and/or power consumption
constraints. In LQF, the UTs encode their information messages by using a family of nested linear codes
$\{\mathcal{C}_k\}$, yielding the corresponding family of nested lattice codes as explained in Section IV, in order
to allow for different coding rates $\{R_k\}$. In LQF, each AT forwards its quantized channel observations
directly to the CP without local decoding. Hence, LQF can be seen as a special case of QMF without
binning. From (26), the CP sees a FF-MAC with $L$-dimensional output:

$$ \begin{bmatrix} \mathbf{u}_1 \\ \vdots \\ \mathbf{u}_L \end{bmatrix} = \mathbf{Q} \begin{bmatrix} \mathbf{c}_1 \\ \vdots \\ \mathbf{c}_K \end{bmatrix} \oplus \begin{bmatrix} \boldsymbol{\zeta}(\mathbf{h}_1, \mathbf{a}_1) \\ \vdots \\ \boldsymbol{\zeta}(\mathbf{h}_L, \mathbf{a}_L) \end{bmatrix}. \tag{28} $$



The following result provides an achievable sum rate of LQF subject to the constraint $2\log p < R_0$.

Theorem 4: Consider the FF-MAC defined by $\mathbf{Q} \in \mathbb{F}_{p^2}^{L \times K}$ as in (28). If $\mathbf{Q}$ has rank $K$, the following
sum rate is achievable by linear coding:

$$ R_{\rm FF\text{-}MAC} = 2K\log p - \sum_{k=1}^{K} H(\zeta(\mathbf{h}_k, \mathbf{a}_k)). \tag{29} $$

Proof: See Appendix B. ■
The relative merit of QCoF and LQF depends on $R_0$, $p$, and on the actual realization of the channel
matrix $\mathbf{H}$. In symmetric channel cases (e.g., the Wyner model [16]), where the ATs have the same computation
rate, QCoF beats LQF by making $p$ sufficiently large. On the other hand, if the modulation order $p$ is
predetermined as in a conventional wireless communication system, and this is relatively small with
respect to $R_0$, LQF outperforms QCoF by breaking the limitation of the minimum computation rate over
the ATs.
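The trade-off between the two schemes can be made concrete with a small numerical sketch. The code below evaluates the QCoF sum rate without network decomposition (a single subnetwork, which is a simplifying assumption) and the LQF sum rate of Theorem 4, for given noise entropies in bits. The function names and the entropy values used in the example are hypothetical.

```python
import math

def qcof_sum_rate(entropies, p, r0):
    """K * min{R0, min_k (2 log2 p - H_k)}: all ATs share the weakest rate."""
    rates = [2.0 * math.log2(p) - h for h in entropies]
    return len(entropies) * min(r0, min(rates))

def lqf_sum_rate(entropies, p, r0):
    """2 K log2 p - sum_k H_k, valid only when 2 log2 p < R0."""
    if not 2.0 * math.log2(p) < r0:
        raise ValueError("LQF rate expression requires 2 log2(p) < R0")
    return sum(2.0 * math.log2(p) - h for h in entropies)

# Hypothetical entropies (bits) at three ATs, with p = 7 and R0 = 6 bits.
unequal = [0.5, 1.0, 3.0]
qcof = qcof_sum_rate(unequal, 7, 6.0)  # limited by the AT with entropy 3.0
lqf = lqf_sum_rate(unequal, 7, 6.0)    # sums the individual rates instead
```

With unequal entropies the LQF value exceeds the QCoF value, because QCoF is limited by the minimum computation rate. With equal entropies the two expressions coincide for this fixed $p$.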
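B. RQCoF for the DAS Downlink

Exchanging the roles of ATs and UTs and using (25), the DAS downlink with quantization at the
receivers is turned into the FF-BC

$$ \begin{bmatrix} \mathbf{u}_1 \\ \vdots \\ \mathbf{u}_K \end{bmatrix} = \mathbf{Q} \begin{bmatrix} \mathbf{x}_1 \\ \vdots \\ \mathbf{x}_L \end{bmatrix} \oplus \begin{bmatrix} \boldsymbol{\zeta}(\mathbf{h}_1, \mathbf{a}_1) \\ \vdots \\ \boldsymbol{\zeta}(\mathbf{h}_K, \mathbf{a}_K) \end{bmatrix}, \tag{30} $$

where $\mathbf{x}_\ell$ denotes the finite-field input associated with AT $\ell$.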



The following result shows that simple matrix inversion over $\mathbb{F}_{p^2}$ can achieve the capacity of this FF-BC.
Intuitively, this is because there is no additional power cost with Zero-Forcing Beamforming (ZFB) in
the finite-field domain (unlike ZFB in the complex domain).

Theorem 5: Consider the FF-BC in (30) for $K = L$. If $\mathbf{Q}$ has rank $L$, the sum capacity is

$$ C_{\rm FF\text{-}BC} = 2L\log p - \sum_{\ell=1}^{L} H(\zeta(\mathbf{h}_\ell, \mathbf{a}_\ell)), \tag{31} $$

and it can be achieved by linear coding.

Proof: See Appendix C. ■
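The idea of matrix-inversion precoding in the finite-field domain can be sketched in a few lines. For simplicity, the example below works over the prime field $\mathbb{F}_p$ with integers modulo $p$ instead of $\mathbb{F}_{p^2}$, which is a simplifying assumption of this sketch. The function names, the matrix `Q` and the message rows `W` are hypothetical, and the channel is taken noise-free to isolate the effect of the precoder. The modular inverse uses `pow(x, -1, p)`, available in Python 3.8 or later.

```python
def inv_mod_p(Q, p):
    """Inverse of a square matrix over the prime field F_p (Gauss-Jordan)."""
    n = len(Q)
    # Augment Q with the identity matrix and reduce all entries modulo p.
    A = [[Q[i][j] % p for j in range(n)] + [int(i == j) for j in range(n)]
         for i in range(n)]
    for col in range(n):
        piv = next((r for r in range(col, n) if A[r][col] != 0), None)
        if piv is None:
            raise ValueError("matrix is singular over F_p")
        A[col], A[piv] = A[piv], A[col]
        inv = pow(A[col][col], -1, p)          # modular inverse of the pivot
        A[col] = [(v * inv) % p for v in A[col]]
        for r in range(n):
            if r != col and A[r][col]:
                f = A[r][col]
                A[r] = [(a - f * b) % p for a, b in zip(A[r], A[col])]
    return [row[n:] for row in A]

def matmul_mod_p(X, Y, p):
    return [[sum(a * b for a, b in zip(row, col)) % p for col in zip(*Y)]
            for row in X]

p = 7
Q = [[1, 2], [3, 4]]            # hypothetical full-rank system matrix
W = [[1, 0, 5], [2, 6, 3]]      # one message row per destination
precoded = matmul_mod_p(inv_mod_p(Q, p), W, p)   # precoding at the CP
received = matmul_mod_p(Q, precoded, p)          # noise-free FF-BC output
```

Here `received` equals `W`, i.e., each destination sees only its own row. Unlike complex-domain zero forcing, the precoded symbols remain elements of the same finite alphabet, so no transmit power is lost.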
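Motivated by Theorem 5, we present the RQCoF scheme using finite-field matrix inversion precoding
at the CP. As for RCoF, we use $L$ nested linear codes $\mathcal{C}_L \subseteq \cdots \subseteq \mathcal{C}_1$, where $\mathcal{C}_\ell$ has rate $R_\ell$,
and let $k_\ell$ denote the UT destination of the $\ell$-th message, encoded by $\mathcal{C}_\ell$. The CP precodes the zero-
padded information messages $\{\mathbf{w}_\ell : \ell = 1, \ldots, L\}$ as in (14) and sends the precoded message $\boldsymbol{\mu}_\ell$ to AT
$\ell$ for all $\ell = 1, \ldots, L$, via the digital backhaul link. AT $\ell$ generates the codeword $\boldsymbol{\nu}_\ell = \boldsymbol{\mu}_\ell \mathbf{G} \in \mathcal{C}_1$, and
the corresponding transmitted signal $\mathbf{x}_\ell$ according to (21), with $\mathbf{t}_\ell = f(\boldsymbol{\mu}_\ell)$. Each UT $k_\ell$ produces its
quantized output according to the scalar mapping (22) and obtains:

$$ \mathbf{u}_{k_\ell} = \mathbf{q}_{k_\ell}^{\sf T}\, \mathbf{Q}^{-1} \begin{bmatrix} \mathbf{w}_1 \mathbf{G} \\ \vdots \\ \mathbf{w}_L \mathbf{G} \end{bmatrix} \oplus \boldsymbol{\zeta}(\mathbf{h}_{k_\ell}, \mathbf{a}_{k_\ell}) = \mathbf{v}_\ell \oplus \boldsymbol{\zeta}(\mathbf{h}_{k_\ell}, \mathbf{a}_{k_\ell}), \tag{32} $$

where $\mathbf{v}_\ell = \mathbf{w}_\ell \mathbf{G}$ is a codeword of $\mathcal{C}_\ell$. Thus, UT $k_\ell$ can recover its desired message if
$R_\ell < 2\log p - H(\zeta(\mathbf{h}_{k_\ell}, \mathbf{a}_{k_\ell}))$. Summarizing, we have: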
Theorem 6: RQCoF applied to a DAS downlink with channel matrix $\mathbf{H} = [\mathbf{h}_1, \ldots, \mathbf{h}_L]^{\sf T} \in \mathbb{C}^{L \times L}$
achieves the sum rate

$$ R_{\rm RQCoF}(\mathbf{H}, \mathbf{A}) = \sum_{\ell=1}^{L} \min\{ R_0,\ 2\log p - H(\zeta(\mathbf{h}_\ell, \mathbf{a}_\ell)) \}. \tag{33} $$

■

VI. Comparison with Known Schemes on the Wyner Model 

In order to obtain clean performance comparisons with other state-of-the-art information theoretic
coding strategies, we consider the symmetric Wyner model [16], which has been used in several other
works for its simplicity and analytic tractability. In particular, we consider comparisons with Quantize
reMap and Forward (QMF) and Decode and Forward (DF) for the DAS uplink, and Compressed Dirty
Paper Coding (CDPC) and Compressed Zero-Forcing Beamforming (CZFB) for the DAS downlink.

In the symmetric Wyner model with $L$ ATs and $L$ UTs, the received signal at the $\ell$-th receiver (AT
for the uplink or UT for the downlink) is given by

$$ \mathbf{y}_\ell = \mathbf{x}_\ell + \gamma(\mathbf{x}_{\ell-1} + \mathbf{x}_{\ell+1}) + \mathbf{z}_\ell, \tag{34} $$

where $\gamma \in (0, 1]$ quantifies the strength of inter-cell interference and $\mathbf{z}_\ell$ has i.i.d. components $\sim \mathcal{CN}(0, 1)$.
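For a finite number of cells, (34) corresponds to a tridiagonal channel matrix. A minimal sketch of this matrix follows; the function name `wyner_matrix` is hypothetical, and the edge cells are assumed to have a single neighbor (no wrap-around), which is an assumption of this sketch.

```python
import numpy as np

def wyner_matrix(L, gamma):
    """Tridiagonal Wyner-model channel: unit direct gain, gamma to each neighbor."""
    H = np.eye(L)
    for l in range(L - 1):
        H[l, l + 1] = gamma   # interference from the right neighbor
        H[l + 1, l] = gamma   # interference from the left neighbor
    return H

H = wyner_matrix(5, 0.7)
rng = np.random.default_rng(0)
x = rng.standard_normal(5) + 1j * rng.standard_normal(5)
z = (rng.standard_normal(5) + 1j * rng.standard_normal(5)) / np.sqrt(2.0)
y = H @ x + z   # one channel use of (34) for all receivers at once
```

The banded structure is what later guarantees a full-rank system matrix when each receiver chooses its integer coefficients on its own.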



A. Review of some Classical Coding Strategies 

1) QMF: Each AT performs vector quantization of its received signal at some rate $R' > R_0$ and
maps the blocks of $nR'$ quantization bits into binary words of length $nR_0$ by using a hashing function
(binning). The CP performs the joint decoding of all UTs' messages based on the observation of all the
(hashed) quantization bits. Using random coding with Gaussian codes and random binning, [12] proves
the following achievable rate of QMF:

$$ R_{\rm QMF} = \max_{r \ge 0}\ \min_{\mathcal{S} \subseteq [1:L]} \Big\{ |\mathcal{S}|(R_0 - r) + \log\det\big( \mathbf{I} + \mathrm{SNR}(1 - 2^{-r})\, \mathbf{H}(\mathcal{S}^c, [1:L])\, \mathbf{H}(\mathcal{S}^c, [1:L])^{\sf H} \big) \Big\}. \tag{35} $$

As $R_0 \to \infty$, $R_{\rm QMF}$ tends to the sum rate of the underlying multi-antenna G-MAC channel with $L$ users
and one $L$-antenna receiver. For $\mathrm{SNR} \to \infty$ and fixed $R_0$, $R_{\rm QMF} \to L R_0$ [12], [15]. While for a
general channel matrix computing (35) is non-trivial, a remarkable result of [12] is that for the Wyner
model in the limit of $L \to \infty$ the QMF rate per user can be simplified to

$$ R_{\rm QMF,\,per\text{-}user} = F(r^*), \tag{36} $$
where

$$ F(r) = \int_0^1 \log\big( 1 + \mathrm{SNR}\,(1 - 2^{-r})\,(1 + 2\gamma\cos(2\pi\theta))^2 \big)\, d\theta, $$

and where $r^*$ is the solution of the equation $F(r) = R_0 - r$. A simplified version of QMF does not
use binning, and simply forwards to the CP the quantization bits collected at the ATs. We refer to this
scheme as Quantize and Forward (QF), without the re-mapping. In this case, the quantization rate is
$R' = R_0$. From [30], the achievable sum rate of QF is given by

$$ R_{\rm QF} = \log\det\big( \mathbf{I} + \mathrm{SNR}\, \mathbf{D} \mathbf{H} \mathbf{H}^{\sf H} \big), \tag{37} $$

where $\mathbf{D} = \mathrm{diag}(1/(1 + D_\ell) : \ell = 1, \ldots, L)$ and $D_\ell = (1 + \mathrm{SNR}\|\mathbf{h}_\ell\|^2)/(2^{R_0} - 1)$ denotes the variance
of quantization noise at AT $\ell$.
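The QF sum rate (37) is straightforward to evaluate numerically. The sketch below is one possible implementation in bits (base-2 logarithm); the function name `qf_sum_rate` and the example matrix are hypothetical, and $\mathbf{h}_\ell$ is taken to be the $\ell$-th row of $\mathbf{H}$.

```python
import numpy as np

def qf_sum_rate(H, snr, r0):
    """Evaluate log2 det(I + SNR * D * H * H^H) with D as defined after (37)."""
    row_gain = np.sum(np.abs(H) ** 2, axis=1)            # ||h_l||^2 per AT
    d = (1.0 + snr * row_gain) / (2.0 ** r0 - 1.0)       # quantization noise variance
    D = np.diag(1.0 / (1.0 + d))
    M = np.eye(H.shape[0]) + snr * D @ H @ H.conj().T
    _, logdet = np.linalg.slogdet(M)                     # natural-log determinant
    return logdet / np.log(2.0)

H = np.array([[1.0, 0.7, 0.0],
              [0.7, 1.0, 0.7],
              [0.0, 0.7, 1.0]])
low = qf_sum_rate(H, snr=100.0, r0=2.0)
high = qf_sum_rate(H, snr=100.0, r0=6.0)
```

As the backhaul rate grows, the quantization noise vanishes and the value approaches the unquantized $\log\det(\mathbf{I} + \mathrm{SNR}\,\mathbf{H}\mathbf{H}^{\sf H})$.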

2) DF: In the Wyner model, each AT $\ell$ sees the three-input G-MAC formed by UTs $\ell - 1$, $\ell$ and
$\ell + 1$. Imposing either to treat interference as noise, or to decode all messages at each AT, yields [12]:

$$ R_1 = \log\Big( 1 + \frac{\mathrm{SNR}}{1 + 2\gamma^2 \mathrm{SNR}} \Big) $$

$$ R_2 = \min\Big\{ \tfrac{1}{2}\log(1 + 2\gamma^2 \mathrm{SNR}),\ \tfrac{1}{3}\log(1 + (1 + 2\gamma^2)\mathrm{SNR}) \Big\} $$

$$ R_{\rm sum} = L \times \min\{ \max(R_1, R_2), R_0 \}. $$

This scheme has no joint-processing gain. However, when $R_0$ is sufficiently small compared to the rates
achievable over the wireless channel, or when $\gamma$ is very small, this scheme can be optimal [12], [15]. In
fact, DF is what is implemented today in a network of small cells, where each AT operates as a stand-alone
base station, and the decoded packets are sent to a common node that may use packet selection
macro-diversity, in the case some of the base stations fail to decode. Therefore, it is useful to compare
with DF to quantify the potential gains of other schemes with respect to current technology.
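A short numerical sketch of the DF sum rate follows. It assumes that the two terms of $R_2$ carry the prefactors $1/2$ and $1/3$, i.e., the symmetric-rate constraints of the two interferers and of all three users of the G-MAC; these prefactors are an assumption of this sketch, as is the function name `df_sum_rate`. Rates are in bits.

```python
import math

def df_sum_rate(L, gamma, snr, r0):
    """DF sum rate on the Wyner model: best of the two local decoding options."""
    g2 = 2.0 * gamma ** 2
    # Option 1: treat the two neighboring UTs as noise.
    r1 = math.log2(1.0 + snr / (1.0 + g2 * snr))
    # Option 2: decode all three messages (assumed symmetric-rate constraints).
    r2 = min(0.5 * math.log2(1.0 + g2 * snr),
             math.log2(1.0 + (1.0 + g2) * snr) / 3.0)
    return L * min(max(r1, r2), r0)

snr = 10 ** 2.5  # 25 dB
backhaul_limited = df_sum_rate(10, 0.7, snr, r0=3.0)
wireless_limited = df_sum_rate(10, 0.7, snr, r0=100.0)
```

At 25 dB and $\gamma = 0.7$, decoding all three messages is the better local option under these assumptions, and the sum rate saturates once $R_0$ exceeds the per-user wireless rate.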

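3) CDPC: We focus now on the DAS downlink. In CDPC the CP performs joint DPC under per-antenna
power constraint and sends the compressed (or quantized) DPC codewords to the corresponding
ATs via wired links. As a consequence, the ATs also transmit quantization noise. Let $\mathbf{v}_\ell$ be the DPC-encoded
signal to be transmitted by AT $\ell$ and let $\hat{\mathbf{v}}_\ell$ denote its quantized version. Define $\sigma_{v_\ell}^2 = \frac{1}{n}\mathbb{E}[\|\mathbf{v}_\ell\|^2]$
and $\sigma_{\hat{v}_\ell}^2 = \frac{1}{n}\mathbb{E}[\|\hat{\mathbf{v}}_\ell\|^2]$. From the standard rate distortion theory, an achievable quantization distortion $D_\ell$
is given by

$$ R(D_\ell) = \min_{p_{\hat{V}_\ell | V_\ell} :\ \mathbb{E}[|V_\ell - \hat{V}_\ell|^2] \le D_\ell} I(V_\ell; \hat{V}_\ell) \le I(V_\ell; \hat{V}_\ell) = \log(1 + \sigma_{v_\ell}^2 / D_\ell), $$

where the upper bound follows from the choice $\hat{V}_\ell = V_\ell + Z_\ell$ with $Z_\ell \sim \mathcal{CN}(0, D_\ell)$ and $V_\ell$ with variance
$\sigma_{v_\ell}^2$. Letting $R_0 = \log(1 + \sigma_{v_\ell}^2 / D_\ell)$ and solving for $D_\ell$ we obtain

$$ D_\ell = \frac{\sigma_{v_\ell}^2}{2^{R_0} - 1}. \tag{38} $$

Using the fact that $\sigma_{\hat{v}_\ell}^2 = \sigma_{v_\ell}^2 + D_\ell$, the per-antenna power constraint $\sigma_{\hat{v}_\ell}^2 \le \mathrm{SNR}$ imposed at each AT $\ell$
yields

$$ \sigma_{v_\ell}^2 \le \mathrm{SNR}\, \frac{2^{R_0} - 1}{2^{R_0}}, \quad \text{for } \ell = 1, \ldots, L. \tag{39} $$

Using (39) in (38), we obtain $D_\ell = \mathrm{SNR}\, 2^{-R_0}$ for $\ell = 1, \ldots, L$. At the $\ell$-th UT receiver, the variance of
the effective noise is given by

$$ \sigma_\ell^2 = 1 + \|\mathbf{h}_\ell\|^2\, \mathrm{SNR}\, 2^{-R_0}. \tag{40} $$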
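The chain (38) to (40) can be verified with a few lines of arithmetic. The sketch below takes (39) with equality; the function names are hypothetical and the numerical values are arbitrary.

```python
def cdpc_quantization(snr, r0):
    """Signal variance from (39) with equality and distortion from (38)."""
    sigma_v2 = snr * (2.0 ** r0 - 1.0) / 2.0 ** r0
    dist = sigma_v2 / (2.0 ** r0 - 1.0)
    return sigma_v2, dist

def effective_noise_var(h_norm2, snr, r0):
    """Effective noise variance (40): thermal noise plus forwarded quantization noise."""
    _, dist = cdpc_quantization(snr, r0)
    return 1.0 + h_norm2 * dist

sigma_v2, dist = cdpc_quantization(snr=100.0, r0=4.0)
total_power = sigma_v2 + dist            # meets the per-antenna constraint
noise = effective_noise_var(2.0, 100.0, 4.0)
```

With $\mathrm{SNR} = 100$ and $R_0 = 4$ bits, the distortion is $100 \cdot 2^{-4} = 6.25$, the useful signal variance is $93.75$, and their sum equals the per-antenna power constraint.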

Then, an achievable sum rate of CDPC is equal to the sum capacity of the resulting vector BC with
the above modifications (i.e., per-antenna power constraint and noise variance). This can be computed
using the efficient algorithm given in [31], based on Lagrangian duality. Further, the closed-form rate
expression was provided in [13] for the so-called soft-handoff Wyner model, a simplified variant of the
Wyner model where each receiver has only one interfering signal from its left neighboring cell. While
CDPC is expected to be near optimal for large $R_0$, it is generally suboptimal at finite (possibly small)
$R_0$.

4) CZFB: The CP performs precoding with the inverse channel matrix $\mathbf{B} = \mathbf{H}^{-1}$ and sends the compressed
ZFB signals to the corresponding ATs via wired links. As in CDPC, the ATs forward also quantization
noise, such that the variance of effective noise at the $\ell$-th UT is given again by (40). The transmit power
constraint (39) holds verbatim. Because of the non-unitary precoding, the useful signal power is given
by $\frac{2^{R_0} - 1}{2^{R_0}} \frac{\mathrm{SNR}}{\|\mathbf{b}_\ell\|^2}$, where $\mathbf{b}_\ell^{\sf T}$ is the $\ell$-th row of the precoding matrix $\mathbf{B}$. It follows that CZFB achieves the
sum rate

$$ R_{\rm CZFB} = \sum_{\ell=1}^{L} \log\Big( 1 + \frac{\mathrm{SNR}/\|\mathbf{b}_\ell\|^2}{1 + (1 + \|\mathbf{h}_\ell\|^2 \mathrm{SNR})/(2^{R_0} - 1)} \Big). \tag{41} $$

B. Numerical Results 

Thanks to the banded structure of the Wyner model channel matrix, the resulting system matrix of CoF
(resp., RCoF) is guaranteed to have rank $L$ although every AT (resp., UT) determines its integer coefficients
vector in a distributed way. In addition, the non-integer penalty, which may be relevant for specific
values of $\gamma$, can be mitigated by using a power allocation strategy, in order to create more favorable channel
coefficients for the integer conversion at each receiver. In [15] a further improved strategy is proposed
based on superposition coding, where the user messages are split into two layers, and one layer is treated
as noise while the other is treated by CoF. Here we focus on simple power allocation, since it is simple,
practical, and captures a significant fraction of the gains achieved with superposition coding. The power
allocation strategy works as follows: odd-numbered UTs (resp., ATs) transmit at power $\beta P$ and even-numbered
UTs (resp., ATs) transmit at power $(2 - \beta)P$, for $\beta \in [0, 1]$. The role of odd- and even-numbered
UTs (or ATs) is alternately reversed in successive time slots, such that each UT (resp., AT) satisfies its
individual power constraint on average. Accordingly, the effective coefficients of the channel for odd-numbered
and even-numbered relays are $\mathbf{h}_o = [\gamma\sqrt{2 - \beta}, \sqrt{\beta}, \gamma\sqrt{2 - \beta}]$ and $\mathbf{h}_e = [\gamma\sqrt{\beta}, \sqrt{2 - \beta}, \gamma\sqrt{\beta}]$.
For given $\gamma$, the parameter $\beta \in [0, 1]$ can be optimized to make the effective channels better suited for
the integer approximation in the CoF receiver mapping. We have two computation rates, $R(\mathbf{h}_o, \mathbf{a}_o)$ and
$R(\mathbf{h}_e, \mathbf{a}_e)$, at the odd and even numbered receivers. The achievable symmetric rate of CoF (or RCoF)
with power allocation is given by $\min\{R_0, R(\mathbf{h}_o, \mathbf{a}_o, \mathrm{SNR}), R(\mathbf{h}_e, \mathbf{a}_e, \mathrm{SNR})\}$. Notice that the odd- and
even-numbered relays can optimize their own equation coefficients independently, but the optimization
with respect to $\beta$ is common to all, and the computation rate is the minimum computation rate over all
the relays, since the same lattice code $\mathcal{C}$ is used across all users. In Fig. 5 we show the performance of
various relaying strategies for the DAS uplink with SNR = 25 dB, as a function of backhaul rate $R_0$.
$L = \infty$ is assumed in order to use the simple rate expression of QMF in (36). Fig. 5 shows that the
power allocation strategy significantly reduces the integer approximation penalty and almost achieves the
cut-set outer bound (i.e., capacity) for $R_0 < 7$ bits. Not surprisingly, QCoF with $p = 251$ only
pays the shaping penalty with respect to CoF, i.e., it approaches the performance of the corresponding
high-dimensional scheme within $\approx 0.5$ bit per complex dimension.

Fig. 5. SNR = 25 dB and $L = \infty$. Achievable rates per user as a function of $R_0$, for the DAS uplink in the Wyner model
case with inter-cell interference parameter $\gamma = 0.7$.

Fig. 6. SNR = 25 dB and $L = 10$. Achievable sum rates as a function of $R_0$, for the DAS downlink in the Wyner model case
with inter-cell interference parameter $\gamma = 0.7$.
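The power allocation described above can be explored with a small grid search over $\beta$. The sketch below makes several assumptions that are not taken from this section: the computation rate is evaluated with the standard compute-and-forward expression $R(\mathbf{h}, \mathbf{a}, \mathrm{SNR}) = \log^+\big( (\|\mathbf{a}\|^2 - \mathrm{SNR}\,|\mathbf{h}^{\sf T}\mathbf{a}|^2 / (1 + \mathrm{SNR}\|\mathbf{h}\|^2))^{-1} \big)$, the integer search of Algorithm 1 is replaced by brute force over a small box of real integer vectors, and all function names are hypothetical.

```python
import itertools
import math
import numpy as np

def comp_rate(h, a, snr):
    """Computation rate in bits for real channel h and integer vector a."""
    h = np.asarray(h, dtype=float)
    a = np.asarray(a, dtype=float)
    denom = a @ a - snr * (h @ a) ** 2 / (1.0 + snr * (h @ h))
    return max(0.0, -math.log2(denom))

def best_comp_rate(h, snr, amax=3):
    """Brute-force stand-in for the integer search (not Algorithm 1 itself)."""
    cands = itertools.product(range(-amax, amax + 1), repeat=len(h))
    return max(comp_rate(h, a, snr) for a in cands if any(a))

def symmetric_rate(gamma, beta, snr, r0):
    """min{R0, R(h_o), R(h_e)} for the odd/even effective channels."""
    h_o = [gamma * math.sqrt(2 - beta), math.sqrt(beta), gamma * math.sqrt(2 - beta)]
    h_e = [gamma * math.sqrt(beta), math.sqrt(2 - beta), gamma * math.sqrt(beta)]
    return min(r0, best_comp_rate(h_o, snr), best_comp_rate(h_e, snr))

def optimize_beta(gamma, snr, r0, grid=21):
    """Return (best symmetric rate, best beta) over a uniform grid on [0, 1]."""
    betas = np.linspace(0.0, 1.0, grid)
    return max((symmetric_rate(gamma, b, snr, r0), b) for b in betas)

rate_pa, beta_star = optimize_beta(gamma=0.7, snr=10 ** 2.5, r0=8.0)
rate_equal = symmetric_rate(0.7, 1.0, 10 ** 2.5, 8.0)   # beta = 1: equal power
```

Since the grid contains $\beta = 1$, the optimized rate `rate_pa` can never be smaller than the equal-power rate `rate_equal`; how large the gain is depends on $\gamma$ and on the SNR.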

We observe a similar trend for the downlink schemes, shown in Fig. 6. In this case, the achievable
sum rate of RCoF with power allocation is given by

$$ R_{\rm sum} = \frac{L}{2}\big( \min\{R_0, R(\mathbf{h}_o, \mathbf{a}_o, \mathrm{SNR})\} + \min\{R_0, R(\mathbf{h}_e, \mathbf{a}_e, \mathrm{SNR})\} \big), \tag{42} $$

where the average, instead of the minimum, between odd and even numbered UTs is due to the fact that
in RCoF we can use two different lattice codes and therefore the rates are not constrained to be all equal.
RCoF outperforms CDPC for $R_0 < 6.5$ bits per channel use.

It is remarkable to observe that the fully practical and easily implementable quantized schemes QCoF
and RQCoF can outperform other conventional practical schemes such as DF and CZFB, respectively.
These results show that CoF and RCoF are good candidates for the DAS uplink and downlink, respectively,
in particular in the regime of small to moderate $R_0$ and high SNR. This regime is relevant for small
cell networks with limited backhaul cooperation, where the backhaul becomes the system bottleneck.
Further, we observed that the proposed schemes can be significantly improved by mitigating the impact
of the non-integer penalty. In this model, power allocation is effective due to the symmetric structure
of the system. However, it is not clear how to extend the power allocation approach to the general case of a
wireless network whose channel matrix is the result of fading, shadowing and pathloss, and therefore
does not enjoy any special easily parameterized structure. In the next section we address the case of a
general wireless network with random channel coefficients, and show that multiuser diversity (i.e., AT/UT
selection) can greatly improve the performance of the basic schemes.

VII. Antenna and User Selection 

Since the proposed schemes require an equal number of ATs and UTs active at each given time, in
a general DAS with $K$ UTs and $L$ ATs the system must select which terminals are active in every
scheduling slot. We define the "active" set of UTs $\mathcal{U} \subseteq [1:K]$ as the subset of UTs that are actually
scheduled for transmission (resp., reception) on the current uplink (resp., downlink) slot, comprising $n$
channel uses. Similarly, the "active" set of ATs $\mathcal{A} \subseteq [1:L]$ is defined as the subset of ATs that are used
for reception (resp., transmission) on the current uplink (resp., downlink) slot.

A. Antenna Selection for the DAS Uplink 

We assume that the active set of UTs is fixed a priori. Without loss of generality, we can fix $\mathcal{U} = [1:K]$
and assume $K \le L$. Our goal is to select a subset $\mathcal{A} \subseteq [1:L]$ of ATs of cardinality $K$. Recall that every
AT chooses the integer combination coefficients, and therefore its vector $\mathbf{q}_\ell$, using Algorithm 1 in order
to maximize its own computation rate $R_\ell = R(\mathbf{h}_\ell, \mathbf{a}_\ell, \mathrm{SNR})$. The CP knows $\{\mathbf{q}_\ell, R_\ell : \ell \in [1:L]\}$. The
CP aims at maximizing the sum rate such that the resulting system matrix is full-rank, by selecting a
subset of ATs for the given UT active set $\mathcal{U}$.

1) AT selection for CoF (or QCoF): From Theorem 1, the AT selection problem consists of finding
$\mathcal{A}$ solution of:

$$ \max_{\mathcal{A} \subseteq [1:L]}\ \sum_{s=1}^{S(\mathcal{A})} |\mathcal{A}_s| \min\{ R_0, \min\{ R_\ell : \ell \in \mathcal{A}_s \} \} \tag{43} $$

$$ \text{subject to}\quad \mathrm{Rank}(\mathbf{Q}(\mathcal{A}, \mathcal{U})) = |\mathcal{U}|, \tag{44} $$

where $S(\mathcal{A})$ indicates the number of disjoint subnetworks with respect to $\mathbf{Q}(\mathcal{A}, \mathcal{U})$. This problem has no
particularly nice structure and the optimal solution is obtained, in general, by exhaustive search over all
$|\mathcal{U}| \times |\mathcal{U}|$ submatrices of $\mathbf{Q}([1:L], \mathcal{U})$. Yet, we notice that if an optimal solution $\mathcal{A}^*$ does not decompose
(i.e., $S(\mathcal{A}^*) = 1$), the simple greedy Algorithm 2 given below finds it (see Lemma 3). Namely, there
exists a low-complexity algorithm to find an optimal AT selection for dense networks whose system
matrix $\mathbf{Q}([1:L], \mathcal{U})$ cannot be decomposed in block-diagonal form.

In general, we may have several disjoint subnetworks, each of which does not decompose further,
even when removing some ATs. Then, we can perform antenna selection by using Algorithm 2 on each
subnetwork component. If the optimum solution of each subnetwork component does not involve further
network decomposition, by Lemma 3 we are guaranteed to arrive at an optimal global solution. This
generally suboptimal (but efficient) approach can be summarized as

• For given $\mathbf{Q} = \mathbf{Q}([1:L], \mathcal{U})$, perform network decomposition using depth-first or breadth-first
search [28], yielding disjoint subnetworks $\mathbf{Q}(\mathcal{A}_s, \mathcal{U}_s)$ for $s = 1, \ldots, S$.

• For each subnetwork $\mathbf{Q}(\mathcal{A}_s, \mathcal{U}_s)$, run Algorithm 2 and find a good selection $\mathcal{A}_s^* \subseteq \mathcal{A}_s$ with
$|\mathcal{A}_s^*| = |\mathcal{U}_s|$.

• Finally, obtain the set of active ATs, $\mathcal{A}^* = \bigcup_{s=1}^{S} \mathcal{A}_s^*$ such that $|\mathcal{A}^*| = |\mathcal{U}|$.



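Algorithm 2 The Greedy Algorithm

Input: $(\mathbf{Q}, \{w_\ell : \ell = 1, \ldots, m\})$ where $\mathbf{Q}$ is a full-rank $m \times n$ matrix with $m \ge n$
Output: $\mathcal{S} \subseteq [1:m]$ with $|\mathcal{S}| = n$

1) Sort $[1:m]$ such that $w_1 \ge w_2 \ge \cdots \ge w_m$

2) Initially, $\ell = 1$ and $\mathcal{S} = \emptyset$

3) If $\mathrm{Rank}(\mathbf{Q}(\mathcal{S} \cup \{\ell\}, [1:n])) > \mathrm{Rank}(\mathbf{Q}(\mathcal{S}, [1:n]))$,
then $\mathcal{S} \leftarrow \mathcal{S} \cup \{\ell\}$

4) Set $\ell = \ell + 1$

5) Repeat 3)-4) until $|\mathcal{S}| = n$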



We have

Lemma 3: If $\mathrm{Rank}(\mathbf{Q}) = n$, Algorithm 2 finds a solution to the problem

$$ \max_{\mathcal{S} \subseteq [1:m]}\ \min\{ w_\ell : \ell \in \mathcal{S} \} \tag{45} $$

$$ \text{subject to}\quad \mathrm{Rank}(\mathbf{Q}(\mathcal{S}, [1:n])) = n. \tag{46} $$

Proof: Let $\mathbf{Q}'$ be the row-permuted matrix of $\mathbf{Q}$ according to the decreasing ordering of the weights
$w_\ell$. The problem is then reduced to finding the minimum row index $v$ such that $\mathbf{Q}'([1:v], [1:n])$ has
rank $n$. This is precisely what Algorithm 2 does. ■
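A compact implementation of Algorithm 2 may help to fix ideas. The sketch below computes ranks over the prime field $\mathbb{F}_p$ with integers modulo $p$, which is a simplification with respect to the system matrices over $\mathbb{F}_{p^2}$ used here. The function names and the example matrix are hypothetical, row indices are 0-based, and `pow(x, -1, p)` requires Python 3.8 or later.

```python
def rank_mod_p(rows, p):
    """Rank over the prime field F_p via Gaussian elimination."""
    M = [[v % p for v in r] for r in rows]
    rank, ncols = 0, (len(M[0]) if M else 0)
    for col in range(ncols):
        piv = next((r for r in range(rank, len(M)) if M[r][col]), None)
        if piv is None:
            continue                      # no pivot in this column
        M[rank], M[piv] = M[piv], M[rank]
        inv = pow(M[rank][col], -1, p)
        M[rank] = [(v * inv) % p for v in M[rank]]
        for r in range(len(M)):
            if r != rank and M[r][col]:
                f = M[r][col]
                M[r] = [(a - f * b) % p for a, b in zip(M[r], M[rank])]
        rank += 1
    return rank

def greedy_select(Q, weights, p):
    """Scan rows by decreasing weight; keep a row only if it increases the rank."""
    n = len(Q[0])
    order = sorted(range(len(Q)), key=lambda i: -weights[i])
    chosen = []
    for i in order:
        if rank_mod_p([Q[j] for j in chosen + [i]], p) > len(chosen):
            chosen.append(i)
        if len(chosen) == n:
            break
    return chosen

Q = [[1, 0], [2, 0], [0, 1], [1, 1]]     # row 1 is a multiple of row 0
selected = greedy_select(Q, weights=[5.0, 4.0, 3.0, 2.0], p=7)
```

In this example the second-best row is skipped because it is linearly dependent on the best one, so the selection is rows 0 and 2. If the input matrix has rank smaller than $n$, the returned list is shorter than $n$, which corresponds to a rank-deficient selection.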
An immediate corollary of Lemma 3 is that, if one disregards network decomposition, then Algorithm
2 finds the maximum computation rate over the AT selection. In fact, it is sufficient to use Algorithm 2
with $m = L$, $n = K$, and input $\mathbf{Q} = \mathbf{Q}([1:L], \mathcal{U})$ and $w_\ell = \min\{R_0, R_\ell\}$ for $\ell = 1, \ldots, L$.

2) AT selection for LQF: From Theorem 4, the AT selection problem consists of finding $\mathcal{A}$ solution
of:

$$ \max_{\mathcal{A} \subseteq [1:L]}\ \sum_{\ell \in \mathcal{A}} \min\{R_0, R_\ell\} \tag{47} $$

$$ \text{subject to}\quad \mathrm{Rank}(\mathbf{Q}(\mathcal{A}, \mathcal{U})) = |\mathcal{U}|, \tag{48} $$

where we let $R_\ell = 2\log p - H(\zeta(\mathbf{h}_\ell, \mathbf{a}_\ell))$ (see Section V-A). This problem consists of the maximization
of a linear function subject to a matroid constraint, where the matroid $\mathcal{M} = (\Omega, \mathcal{I})$ is defined
by the ground set $\Omega = [1:L]$ and by the collection of independent sets
$\mathcal{I} = \{\mathcal{A} \subseteq \Omega : \mathbf{Q}(\mathcal{A}, \mathcal{U}) \text{ has linearly independent rows}\}$. Rado and Edmonds [32], [33] proved that a greedy
algorithm finds an optimal solution. In this case, such algorithm coincides with Algorithm 2 with input
$\mathbf{Q} = \mathbf{Q}([1:L], \mathcal{U})$ and $w_\ell = \min\{R_0, R_\ell\}$.

B. User Selection for the DAS Downlink 

In this case we assume that the set of ATs $\mathcal{A} = [1:L]$ is fixed and $K \ge L$. Hence, we wish
to select a subset $\mathcal{U} \subseteq [1:K]$ of cardinality $L$ such that the resulting system matrix has rank $L$
and the DAS downlink sum rate is maximized. The CP has knowledge of the downlink system matrix
$\mathbf{Q}([1:K], \mathcal{A}) = [\mathbf{q}_1, \ldots, \mathbf{q}_K]^{\sf T}$ and the set of individual user computation rates, $R_k = R(\mathbf{h}_k, \mathbf{a}_k, \mathrm{SNR})$
for RCoF, or $R_k = 2\log p - H(\zeta(\mathbf{h}_k, \mathbf{a}_k))$ for RQCoF (see Theorem 6). The UT selection problem
consists of finding $\mathcal{U}$ solution of:

$$ \max_{\mathcal{U} \subseteq [1:K]}\ \sum_{k \in \mathcal{U}} \min\{R_0, R_k\} \tag{49} $$

$$ \text{subject to}\quad \mathrm{Rank}(\mathbf{Q}(\mathcal{U}, \mathcal{A})) = |\mathcal{A}|. \tag{50} $$

As noticed before, this can be regarded as the maximization of a linear function over a matroid constraint.
Therefore, Algorithm 2 with input $\mathbf{Q} = \mathbf{Q}([1:K], \mathcal{A})$ and $w_k = \min\{R_0, R_k\}$ provides an optimal
solution.



C. Comparison on the Bernoulli-Gaussian Model 

We consider a DAS with channel matrix with i.i.d. elements $[\mathbf{H}]_{\ell,k} = h_{\ell,k}\,\gamma_{\ell,k}$, where $h_{\ell,k} \sim \mathcal{CN}(0, 1)$
and $\gamma_{\ell,k}$ is a Bernoulli random variable with $P(\gamma_{\ell,k} = 1) = q$. This model captures the presence of
Rayleigh fading and some extreme form of path-blocking shadowing, and it is appropriate for a DAS
deployed in buildings, or dense urban environments where the ATs are not mounted on tall towers, in
contrast to conventional macro-cellular systems. For the downlink results, we assume a channel matrix
$\mathbf{H}$ with the same statistics.

Fig. 7. DAS uplink with $K = 5$, $L = 25$ and $R_0 = 6$ bit/channel use: average sum rate vs. SNR on the Bernoulli-Gaussian
model with $q = 0.5$.

Fig. 8. DAS downlink with $K = 25$, $L = 5$ and $R_0 = 6$ bit/channel use: average sum rate vs. SNR on the Bernoulli-Gaussian
model with $q = 0.5$.
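Channel realizations for this model are easy to generate. The following minimal sketch draws one Bernoulli-Gaussian matrix; the function name `bernoulli_gaussian_channel` is hypothetical and the dimensions are arbitrary.

```python
import numpy as np

def bernoulli_gaussian_channel(L, K, q, rng):
    """L x K matrix with entries h * gamma, h ~ CN(0, 1), P(gamma = 1) = q."""
    fading = (rng.standard_normal((L, K))
              + 1j * rng.standard_normal((L, K))) / np.sqrt(2.0)
    mask = rng.random((L, K)) < q        # True with probability q
    return fading * mask

rng = np.random.default_rng(1)
H = bernoulli_gaussian_channel(L=25, K=5, q=0.5, rng=rng)
```

Averaging a rate expression over many such draws gives the ergodic sum rates reported for this model.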

We compute the ergodic sum rates by Monte Carlo averaging with respect to the channel matrix. If the
resulting system matrix, after AT (resp., UT) selection, is rank deficient, then the achieved instantaneous
sum rate is zero for that specific realization. Hence, rank deficiency can be regarded as a sort of
"information outage" event. With the path gain coefficients and noise variance normalization adopted
here, the SNR coincides with the individual node power constraint.

Fig. 7 shows the average sum rate for a DAS uplink with $K = 5$ UTs, $L = 25$ ATs and channel
blocking probability $q = 0.5$. This result clearly shows that the proposed "greedy" AT selection scheme
yields a large improvement over random selection of a fixed number of ATs, and essentially eliminates
the problem of system matrix rank deficiency, provided that $L \gg K$. The curves denoted as "random
selection" indicate the case where a fixed number $L' < L$ of ATs is randomly and uniformly selected,
independent of channel realizations. For $L' = 25$ the DAS uses all the available ATs all the time, yet its
performance is much worse than selecting 5 ATs out of 25 according to the proposed selection scheme.
Fig. 8 shows a similar trend for the DAS downlink. Here, random selection indicates that 5 UTs are
chosen at random out of the 25 UTs. We notice that the sum rate vs. SNR curves for both greedy and
random UT selection have the same slope, indicating that the rank-deficiency problem is not significant
in either case. However, greedy selection achieves a very evident multiuser diversity gain over random
selection. This is not only due to selecting channel vectors with large gains, as in conventional multiuser
diversity, but also to the fact that the greedy selection is able to choose channels that are adapted to the
RCoF strategy, i.e., whose coefficients are well approximated by integers (up to a common scaling factor).
It is also interesting to notice that RQCoF with greedy selection does not suffer from the rank-deficiency
of the system matrix even for $p$ as small as 7, in the example. This is indicated by the fact that the sum
rate gap between RQCoF and RCoF is essentially equal to the shaping loss (0.5 bits per user).

We compared the proposed schemes with QF (uplink) and CDPC (downlink) over the Bernoulli-Gaussian
model. Recall that QF is a special case of QMF without binning, whose achievable sum rate
is given in (37). In QF, more observations (i.e., more active ATs) generally improve the sum rate and
thus AT selection is not needed for the sake of maximizing the sum rate. Yet, for a fair comparison with
the same total backhaul capacity, we considered a greedy search that selects $L' = K < L$ active ATs,
by maximizing at each step the achievable sum rate. From Fig. 9 we observe that CoF outperforms QF
when $R_0$ is small relative to the channel SNR. In this regime, the quantization noise dominates with
respect to the non-integer penalty. Instead, when $R_0$ increases, eventually QF outperforms CoF. Fig. 10
presents a comparison between RCoF and CDPC, leading to similar conclusions for the DAS downlink.

Fig. 9. DAS uplink with $K = 5$ and $L = 50$, Bernoulli-Gaussian model with $q = 0.5$: Colors represent the relative gain of
CoF versus QF (e.g., ratio of sum rates $R_{\rm CoF}/R_{\rm QF}$).

Fig. 10. DAS downlink with $K = 50$ and $L = 5$, Bernoulli-Gaussian model with $q = 0.5$: Colors represent the relative gain
of RCoF versus CDPC (e.g., ratio of sum rates $R_{\rm RCoF}/R_{\rm CDPC}$).

Next, we examine the performance of the proposed low-complexity schemes QCoF, LQF, and RQCoF,
by focusing on a small cell network scenario, where ATs and UTs are close to each other. This is
reflected by considering a fixed and relatively large SNR value (SNR = 25 dB in our simulation), and
comparing performances versus $R_0$, which becomes the main system bottleneck. Fig. 11 shows that
QCoF and LQF are competitive with respect to the performance of QF, with significantly lower decoding
complexity. Furthermore, an additional remarkable feature of the lattice-based schemes is that they can
substantially reduce the channel state information overhead. When QCoF (or LQF) with $p = 7$ is used,
each AT $\ell$ only requires $2K\log_2(7) \approx 28$ (with $K = 5$) bits of feedback per scheduling slot in order to
forward the integer combination coefficients (i.e., $\mathbf{q}_\ell = (q_{\ell,1}, \ldots, q_{\ell,K})$) to the CP.
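The feedback figure quoted above follows from counting $K$ coefficients in $\mathbb{F}_{p^2}$, each carrying $2\log_2 p$ bits. A one-line check (the function name `feedback_bits` is hypothetical):

```python
import math

def feedback_bits(K, p):
    """Bits needed to describe K coefficients in F_{p^2}: 2 K log2(p)."""
    return 2 * K * math.log2(p)

bits = feedback_bits(K=5, p=7)   # about 28.07 bits per AT and scheduling slot
```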

In Fig. 12, RQCoF with $p = 17$ can achieve the same spectral efficiency of CDPC for $R_0 < 5$ bits
and outperforms CZFB in the range of $R_0 < 6$ bits. For CZFB, we made use of the standard greedy
user selection approach [35], [38] to find a subset of $K' < K$ (with $K' = 5$ in our simulation) active
UTs. As expected, QCoF (resp., RQCoF) can achieve the performance of QF (resp., CDPC) when the
wired backhaul rate $R_0$ is not over-dimensioned with respect to the capacity of the wireless channel.
These observations point out that the proposed schemes are suitable for low-complexity implementation
of cooperative home networks where small home-based access points are connected to the CP via digital
subscriber line (DSL).

VIII. Integer-Forcing Beamforming for the High-Capacity Backhaul Case 

CDPC (downlink) and QMF (uplink) are known to be optimal in the limit of $R_0 \to \infty$. In fact, they
converge to the capacity achieving schemes of the corresponding MIMO G-BC and G-MAC channel
models. In this case, CoF and RCoF have no merit because the impact of the non-integer penalty does
not vanish as $R_0$ increases. In this section we focus on the downlink in the regime of $R_0 \to \infty$. DPC is
notoriously difficult to implement in practice, since it requires nested lattice coding with a shaping
lattice $\Lambda$ of high dimension (see for example [34], [37]). On the other hand, it is well-known that restricting
the shaping lattice to have low dimension, in order to make the modulo-$\Lambda$ operation of manageable
complexity, does not provide significant performance benefits with respect to the simple Tomlinson-Harashima
precoding approach, which is equivalent to performing shaping with the cubic lattice $\Lambda = \tau\mathbb{Z}[j]$
[36], [39], [40]. Furthermore, it is also known that Tomlinson-Harashima precoding for the MIMO G-BC
does not provide significant gains with respect to simpler linear beamforming techniques, especially when
user selection and multiuser diversity can be exploited [35]. Therefore, linear beamforming schemes are
often proposed as a viable tradeoff between performance and complexity. When multiuser diversity cannot
be exploited (e.g., the number of UTs $K$ is not large), linear beamforming may suffer from significant
performance degradation when the channel matrix is near singular.

Fig. 12. DAS downlink with SNR = 25 dB, $K = 50$ and $L = 5$: achievable sum rates as a function of $R_0$.



For the uplink case, [22] proposes an Integer-Forcing Receiver (IFR) that can approach the
performance of joint decoding with lower complexity and significantly outperforms the traditional linear
multiuser detector schemes (e.g., the decorrelator or the linear MMSE detector) concatenated with single-user
decoding. The main idea is that the receiver antennas are used to create an effective channel matrix
with integer-valued coefficients, and CoF is used for the resulting integer-valued channel matrix, incurring
no non-integer penalty.

In this section, we present a new beamforming strategy called Integer-Forcing Beamforming (IFB), that
produces a similar effect for the downlink. The precoding matrix $\mathbf{B} = [\mathbf{b}_1, \ldots, \mathbf{b}_L]^{\sf T}$ is chosen such that
the resulting effective channel matrix $\mathbf{H}\mathbf{B}$ is integer valued, i.e., $\mathbf{H}\mathbf{B} = \mathbf{A}$, with $\mathbf{B} = \mathbf{H}^{-1}\mathbf{A}$ for some
integer matrix $\mathbf{A}$. Then, RCoF can be applied as described in Section IV to the resulting integer-valued
effective channel matrix, incurring no non-integer penalty. In short, IFB removes the non-integer penalty
of RCoF but introduces a power penalty (as in ZFB) due to the non-unitary precoding matrix $\mathbf{B}$. Notice
that if we restrict $\mathbf{A} = \mathbf{I}$, then IFB coincides with ZFB. Therefore, by allowing $\mathbf{A}$ to be a general integer
matrix, IFB performs at least as well as ZFB, and usually significantly outperforms it, since its power
penalty can be greatly reduced. Although not investigated further in this work, we observe here that a
more general family of schemes might be devised by trading off the linear precoder power penalty with
the RCoF non-integer penalty, by imposing an "approximated" integer forcing condition.
The detailed procedure of IFB for a given $A$ (to be optimized later) is as follows:

• Precoding over $\mathbb{F}_{p^2}$ to eliminate integer-valued interference: Following the RCoF scheme of Section IV, the CP precodes the zero-padded information messages $\{w_\ell\}$ using $Q^{-1}$, where $Q = g^{-1}([A] \bmod p\mathbb{Z}[j])$ as in (14), encodes the precoded messages $\{\mu_\ell\}$ into the codewords $\{\nu_\ell\}$, and generates the channel inputs $x_\ell = [\nu_\ell + d_\ell] \bmod \Lambda$, for $\ell = 1, \ldots, L$, where $d_\ell$ are dithering sequences, as in the RCoF scheme.

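• Precoding over $\mathbb{C}$ to create an integer-valued channel matrix: Using $B = H^{-1}A$, the CP produces the precoded channel inputs

$$\begin{bmatrix} \tilde{x}_1 \\ \vdots \\ \tilde{x}_L \end{bmatrix} = B \begin{bmatrix} x_1 \\ \vdots \\ x_L \end{bmatrix}. \tag{51}$$

Letting $k_\ell$ denote the index of the UT destination of the $\ell$-th message, its received signal is given by

$$y_{k_\ell} = \sum_{\ell'=1}^{L} a_{k_\ell,\ell'}\, x_{\ell'} + z_{k_\ell}, \tag{52}$$

where $a_{k,\ell'}$ denotes the $(k,\ell')$-th element of $A = HB$ and $z_{k_\ell}$ denotes the additive Gaussian noise at UT $k_\ell$. This is a G-MAC channel as in (3), with integer channel coefficients. Therefore, IFB has eliminated the non-integer penalty of RCoF. Finally, it is immediate to check (same steps as in (15)) that the integer-valued interference is eliminated by RCoF, i.e., each UT $k_\ell$ decodes its own lattice code $\mathcal{C}_\ell$ without multiuser interference.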
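The per-antenna power constraint imposes

$$\frac{1}{n}\mathbb{E}\big[\|\tilde{x}_\ell\|^2\big] \le \mathsf{SNR}, \quad \text{for } \ell = 1, \ldots, L. \tag{53}$$

From (51), we have

$$\frac{1}{n}\mathbb{E}\big[\|\tilde{x}_\ell\|^2\big] = \frac{1}{n}\sum_{\ell'=1}^{L} \mathbb{E}\big[\|x_{\ell'}\|^2\big]\, |b_{\ell,\ell'}|^2, \tag{54}$$

where $b_{\ell,\ell'}$ denotes the $(\ell,\ell')$-th element of $B$. Since $x_\ell$ is uniformly distributed on $\mathcal{V}_\Lambda$, the constraint (53) yields

$$\frac{1}{n}\mathbb{E}\big[\|x_\ell\|^2\big] = \frac{\mathsf{SNR}}{\max\{\|b_\ell\|^2 : \ell = 1, \ldots, L\}}. \tag{55}$$

Hence, we have: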



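Theorem 7: IFB applied to a MIMO G-BC with channel matrix $H \in \mathbb{C}^{L \times L}$ achieves the sum rate

$$R_{\mathrm{IFB}}(H, A) = \sum_{\ell=1}^{L} R\Big(a_\ell, a_\ell, \mathsf{SNR}/\max\{\|b_{\ell'}\|^2 : \ell' = 1, \ldots, L\}\Big), \tag{56}$$

where we let $A = [a_1, \ldots, a_L]^T$ and $H^{-1}A = [b_1, \ldots, b_L]^T$. ■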
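The sum rate of Theorem 7 can be evaluated numerically for a given pair $(H, A)$. The following is a minimal sketch, not the authors' code. It assumes that the computation rate has the standard CoF form $R(h, a, \mathsf{SNR}) = \log_2^+\big((\|a\|^2 - \mathsf{SNR}\,|h^H a|^2/(1 + \mathsf{SNR}\|h\|^2))^{-1}\big)$, which for $h = a$ reduces to $\log_2^+(1/\|a\|^2 + \mathsf{SNR})$. The function names `ifb_precoder` and `ifb_sum_rate` are hypothetical.

```python
import numpy as np

def ifb_precoder(H, A):
    """Return B = H^{-1} A, so that H @ B equals the integer matrix A."""
    return np.linalg.solve(H, A)

def ifb_sum_rate(H, A, snr):
    """Sum rate in the form of (56), under the ASSUMED rate function
    R(a, a, snr_eff) = max(0, log2(1/||a||^2 + snr_eff))."""
    B = ifb_precoder(H, A)
    # Row l of B mixes the lattice codewords into the signal of antenna l.
    row_power = np.sum(np.abs(B) ** 2, axis=1)
    # Per-antenna power constraint: scale by the worst-case row norm.
    snr_eff = snr / row_power.max()
    # ||a_l||^2 for each row a_l of A.
    a_norm2 = np.sum(np.abs(A) ** 2, axis=1)
    rates = np.maximum(0.0, np.log2(1.0 / a_norm2 + snr_eff))
    return float(rates.sum())

if __name__ == "__main__":
    # Nearly singular 2x2 channel: ZFB (A = I) pays a large power penalty.
    H = np.array([[1.0, 1.0], [1.0, 1.1]])
    zfb = ifb_sum_rate(H, np.eye(2), snr=1000.0)
    ifb = ifb_sum_rate(H, np.array([[1.0, 0.0], [1.0, 1.0]]), snr=1000.0)
    print("ZFB sum rate:", zfb)
    print("IFB sum rate:", ifb)
```

In this toy example the integer matrix was picked by hand; choosing $A = I$ reproduces ZFB, so any search over integer matrices that includes the identity cannot do worse than ZFB under this rate expression.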
The optimization of $A$ as a function of $H$ appears to be a hard integer-programming problem without any particular structure lending itself to computationally efficient algorithms. Instead, we resort to the suboptimal approach of optimizing $A$ with respect to the sum power, which is proportional to $\mathrm{tr}(BB^H)$. Hence, the sum-power minimization problem takes on the form

$$\min_{A} \;\; \mathrm{tr}\big(H^{-1}AA^H H^{-H}\big) \quad \text{subject to} \quad \mathrm{Rank}(A) = L. \tag{57}$$
Fig. 13. Achievable ergodic sum rates as a function of SNR for a MIMO-BC with the same number L = K = 5 of ATs (transmit antennas) and UTs (users), over independent Rayleigh fading.



Writing $\mathrm{tr}(H^{-1}AA^H H^{-H}) = \sum_{\ell=1}^{L} \|H^{-1}A([1:L],\ell)\|^2$, where $A([1:L],\ell)$ is the $\ell$-th column of $A$, we notice that problem (57) is equivalent to finding a reduced basis for the lattice generated by $H^{-1}$. In particular, the reduced basis takes on the form $H^{-1}U$, where $U$ is a unimodular matrix over $\mathbb{Z}[j]$. Hence, choosing $A = U$ yields the minimum sum power subject to the full-rank condition in (57). In practice, we used the (complex) LLL algorithm [26], with refinement of the LLL reduced basis approximation by Pohst or Schnorr-Euchner lattice search.
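The lattice-reduction step can be illustrated with a short, unoptimized sketch of complex LLL applied to the columns of $H^{-1}$. This is an illustrative assumption about how such a step could be coded, not the authors' implementation: it recomputes a QR factorization at every iteration, uses the common parameter $\delta = 0.75$, and omits the Pohst or Schnorr-Euchner refinement. The name `complex_lll` is hypothetical.

```python
import numpy as np

def complex_lll(G, delta=0.75):
    """Reduce the columns of a full-rank matrix G.

    Returns (B, U) with B = G @ U and U unimodular over Z[j].
    """
    B = np.array(G, dtype=complex)
    n = B.shape[1]
    U = np.eye(n, dtype=complex)
    k = 1
    while k < n:
        # Upper-triangular factor: R[j, k] / R[j, j] is the Gram-Schmidt
        # coefficient of column k along the j-th orthogonalized column.
        R = np.linalg.qr(B, mode="r")
        # Size-reduce column k against columns k-1, ..., 0.
        for j in range(k - 1, -1, -1):
            mu = R[j, k] / R[j, j]
            c = np.round(mu.real) + 1j * np.round(mu.imag)
            if c != 0:
                B[:, k] -= c * B[:, j]
                U[:, k] -= c * U[:, j]
                R[:, k] -= c * R[:, j]
        # Lovasz condition; swap and step back if it fails.
        lhs = delta * abs(R[k - 1, k - 1]) ** 2
        rhs = abs(R[k, k]) ** 2 + abs(R[k - 1, k]) ** 2
        if lhs > rhs:
            B[:, [k - 1, k]] = B[:, [k, k - 1]]
            U[:, [k - 1, k]] = U[:, [k, k - 1]]
            k = max(k - 1, 1)
        else:
            k += 1
    return B, U

if __name__ == "__main__":
    H = np.array([[1.0, 1.0], [1.0, 1.1]])
    H_inv = np.linalg.inv(H)
    B, A = complex_lll(H_inv)  # A = U is the integer matrix, B = H^{-1} A
    print("sum power, ZFB (A = I):", np.linalg.norm(H_inv) ** 2)
    print("sum power, IFB (A = U):", np.linalg.norm(B) ** 2)
    print("A =\n", A.real.astype(int))
```

Since LLL only approximates the shortest basis, the matrix $A = U$ returned here is a heuristic solution of (57), which is why the text above mentions a further lattice-search refinement.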

We consider the DAS downlink with infinite backhaul capacity with 5 ATs and 5 UTs. The channel matrix $H$ has i.i.d. elements $h(k,\ell) \sim \mathcal{CN}(0,1)$ (independent Rayleigh fading). Fig. 13 shows the ergodic achievable sum rate of IFB, compared with the ergodic channel sum capacity achieved by DPC and with the sum rate achievable by ZFB. We notice that the proposed IFB downlink scheme significantly improves over ZFB and approaches the sum capacity within approximately 0.5 bits per user.

IX. Conclusions 

We considered a Distributed Antenna System (DAS) where several Antenna Terminals (ATs) are connected to a Central Processor (CP) via digital backhaul links of rate $R_0$ bit/s/Hz. The ATs communicate with several User Terminals (UTs) simultaneously and on the same bandwidth, such that the signals sent by the ATs interfere at each UT (downlink) and, vice versa, the signals sent by the UTs interfere at each AT (uplink). The DAS uplink is a special case of a three-layer multi-source single-destination relay network, where the ATs play the role of the relays. The DAS downlink is a special case of a relay broadcast network with one sender and individual messages. For this setup, we considered the application of the Compute and Forward approach in various forms. For the DAS uplink, CoF applies directly. In this case, we proposed system optimization based on network decomposition and on greedy selection of the ATs for a given set of desired active UTs. For the DAS downlink, we proposed a novel scheme referred to as Reverse CoF (RCoF). This scheme reverses the roles of ATs and UTs with respect to the uplink, and uses linear precoding over the finite-field domain in order to eliminate multiuser interference. In this case, we considered system optimization consisting of selecting a subset of UTs for a given set of active ATs. It turns out that in this case the problem can be formulated as the maximization of a linear function subject to a matroid constraint, for which a simple greedy procedure is known to be optimal. We also considered strategies that incorporate the presence of an ADC at the receiver as an unavoidable part of the channel model. In this case, we can design lattice-based strategies that explicitly take into account the presence of the finite-resolution scalar quantizer at the receivers. In particular, this leads to very simple single-user linear coding schemes over $\mathbb{F}_q$ with $q = p^2$ and $p$ a prime. Our own results in [19] and others' results in [20], [21] show that it is possible to approach the theoretical performance of random coding using $q$-ary LDPC codes with linear complexity in the code block length and polynomial complexity in the number of network nodes. For the regime of large $R_0$, we have also introduced a novel linear precoding scheme referred to as Integer Forcing Beamforming (IFB). This can be seen as a generalization of zero-forcing beamforming, where the beamformed channel is forced to be an integer-valued matrix rather than a diagonal matrix. Then, RCoF can be applied to precode over the integer-valued multiuser downlink channel, without further non-integer penalty.

We provided an extensive comparison of the proposed lattice-based strategies with information-theoretic strategies for the DAS uplink and downlink, namely QMF and CDPC, which are known to be near-optimal. We observed that the proposed strategies achieve similar and sometimes better performance in certain relevant regimes while providing a clear path for practical implementation, whereas the information-theoretic terms of comparison are notoriously difficult to implement in practice. As a matter of fact, today's technology relies on the widely suboptimal decode and forward (DF) scheme for the uplink, or on the compressed linear beamforming approach for the downlink, which are easily outperformed by the proposed schemes with similar, if not better, complexity.

In conclusion, we wish to point out that the proposed schemes are competitive when the wired backhaul rate $R_0$ is a limiting factor of the overall system sum rate. For example, in a typical home Wireless Local Area Network setting, the rates supported by the wireless segment are of the order of 10 to 50 Mbit/s, while a typical DSL connection between the wireless router and the DSL central office (playing the role of the CP in our scenario) has rates between 1 and 10 Mbit/s. In this case, the schemes proposed in this paper can provide a viable and practical approach to uplink and downlink centralized processing at manageable complexity.

Appendix A 
Gaussian Approximation 

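Let $\varepsilon = (p/\tau)\,\mathrm{Re}\{\xi(h, a, \alpha)\} \sim \mathcal{N}(0, \sigma_\varepsilon^2)$ with $\sigma_\varepsilon^2 = \sigma_\xi^2/2$. We consider the distribution of the discrete random variable $\nu$ obtained by quantizing $\varepsilon$. The pmf of $\zeta(h, a, \alpha)$ is obtained by considering i.i.d. real and imaginary parts, both distributed as $\nu$. Define the function

$$\Phi(x) \triangleq P\Big(\varepsilon \ge \frac{2x-1}{2}\Big) - P\Big(\varepsilon \ge \frac{2x+1}{2}\Big) = Q\Big(\frac{2x-1}{2\sigma_\varepsilon}\Big) - Q\Big(\frac{2x+1}{2\sigma_\varepsilon}\Big), \tag{58}$$

where $Q(z) = \frac{1}{\sqrt{2\pi}}\int_z^\infty \exp(-t^2/2)\,dt$ is the Gaussian tail function. Recall that $g$ maps the elements $\{0, 1, \ldots, p-1\}$ of $\mathbb{F}_p$ into the set of integers $\{0, 1, \ldots, p-1\} \subset \mathbb{Z}$. We define an interval $\mathcal{I}(x)$ by

$$\mathcal{I}(x) = [x - 0.5,\; x + 0.5]. \tag{59}$$

The pmf of $\nu$ can be computed as

$$P(\nu = \beta) = P\Big(\varepsilon \in \bigcup_{m \in \mathbb{Z}} \mathcal{I}(g(\beta) + pm)\Big). \tag{60}$$

For any $\beta_1, \beta_2 \neq 0$ satisfying $g(\beta_1) + g(\beta_2) = p$, we have $P(\nu = \beta_1) = P(\nu = \beta_2)$, which can be immediately proved using the symmetry of the Gaussian distribution about the origin:

$$P(\nu = \beta_1) = P\Big(\varepsilon \in \bigcup_{m \in \mathbb{Z}} \mathcal{I}(g(\beta_1) + pm)\Big) \tag{61}$$

$$= P\Big(\varepsilon \in \bigcup_{m \in \mathbb{Z}_+ \cup \{0\}} \mathcal{I}(g(\beta_1) + pm)\Big) + P\Big(\varepsilon \in \bigcup_{m \in \mathbb{Z}_- \cup \{0\}} \mathcal{I}(g(\beta_1) - p + pm)\Big) \tag{62}$$

$$= P\Big(\varepsilon \in \bigcup_{m \in \mathbb{Z}_+ \cup \{0\}} \mathcal{I}(g(\beta_1) + pm)\Big) + P\Big(\varepsilon \in \bigcup_{m \in \mathbb{Z}_+ \cup \{0\}} \mathcal{I}(p - g(\beta_1) + pm)\Big) \tag{63}$$

$$= P\Big(\varepsilon \in \bigcup_{m \in \mathbb{Z}_+ \cup \{0\}} \mathcal{I}(g(\beta_1) + pm)\Big) + P\Big(\varepsilon \in \bigcup_{m \in \mathbb{Z}_+ \cup \{0\}} \mathcal{I}(g(\beta_2) + pm)\Big), \tag{64}$$

where $\mathbb{Z}_+$ and $\mathbb{Z}_-$ denote the positive and negative integers, respectively. Thus, we only need to find the pmf of $\nu$ for $g(\nu) \le p/2$, and the other probabilities are directly obtained by symmetry. Using (63) for $\beta \neq 0$, we can quickly compute the pmf of $\nu$ using the function $\Phi(x)$ defined in (58):

$$P(\nu = 0) = \Phi(0) + 2\sum_{m \in \mathbb{Z}_+} \Phi(pm) \tag{65}$$

$$P(\nu = \beta) = \sum_{m \in \mathbb{Z}_+ \cup \{0\}} \Big[\Phi(g(\beta) + pm) + \Phi(p - g(\beta) + pm)\Big]. \tag{66}$$

In fact, $\Phi(x)$ is a monotonically decreasing function of $x$ for $x \ge 0$ and, in general, quickly converges to 0 as $x$ increases. Therefore, we only need a finite number of terms in the summations in (65) and (66), and we observed that it is enough to sum over $m = 0, 1, 2$ in all numerical results presented in this paper.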
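The truncated evaluation of the pmf can be sketched in a few lines of Python. This is only an illustration under stated assumptions: $g$ is taken to be the natural map $g(\beta) = \beta$ on $\{0, \ldots, p-1\}$, the noise standard deviation `sigma` plays the role of $\sigma_\varepsilon$, and the function names are hypothetical.

```python
import math

def gaussian_tail(z):
    """Gaussian tail function Q(z) = P(N(0, 1) >= z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def phi(x, sigma):
    """Phi(x) of (58): probability that eps lies in [x - 0.5, x + 0.5]."""
    return (gaussian_tail((2 * x - 1) / (2 * sigma))
            - gaussian_tail((2 * x + 1) / (2 * sigma)))

def quantized_noise_pmf(p, sigma, m_max=2):
    """pmf of nu over {0, ..., p-1}, with the sums truncated at m = m_max.

    ASSUMPTION: g is the natural map g(beta) = beta.
    """
    # beta = 0: the interval around 0 plus both tails at multiples of p.
    pmf = [phi(0, sigma)
           + 2 * sum(phi(p * m, sigma) for m in range(1, m_max + 1))]
    # beta != 0: integers congruent to beta (positive side) and to
    # p - beta (mirror image of the negative side).
    for beta in range(1, p):
        pmf.append(sum(phi(beta + p * m, sigma) + phi(p - beta + p * m, sigma)
                       for m in range(m_max + 1)))
    return pmf

if __name__ == "__main__":
    pmf = quantized_noise_pmf(p=5, sigma=1.0)
    print(pmf, sum(pmf))
```

The truncation is accurate whenever the neglected intervals lie many standard deviations from the origin; for large `sigma` relative to $p$, a larger `m_max` would be needed.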



Appendix B 
Proof of TheoremH] 

Consider the FF-MAC defined by $y = Qx \oplus \zeta$, where $x = (x_1, \ldots, x_K)^T \in \mathbb{F}_{p^2}^K$ and $y = (y_1, \ldots, y_K)^T \in \mathbb{F}_{p^2}^K$. The capacity region is the union of the rate regions defined by [41]

$$\sum_{k \in S} R_k \le I(\{x_k : k \in S\};\, y \mid \{x_k : k \in S^c\}, q), \quad \forall\, S \subseteq [1:K], \tag{67}$$

over all pmfs $P_{x,q} = P_q \prod_{k=1}^{K} P_{x_k|q}$. Since for any fixed such pmf the region (67) is a polymatroid, the maximum sum rate is achieved on the dominant face $\sum_{k=1}^{K} R_k = I(x; y|q)$. Since the expectation of the maxima is larger than or equal to the maximum of the expectation, we have that $\sum_{k=1}^{K} R_k \le \max_{P_{x,q}} I(x; y|q)$. Finally, since $q \to x \to y$, we have:

$$I(x; y|q) \le I(x, q; y) = I(x; y) + I(q; y|x) = I(x; y), \tag{68}$$

showing that time-sharing is not needed for the maximum sum rate. Since $Q$ is full rank, uniform i.i.d. inputs over $\mathbb{F}_{p^2}$ achieve the upper bound in

$$I(x; y) = H(y) - H(y|x) = H(y) - H(\zeta) \tag{69}$$

$$\le 2K \log p - H(\zeta). \tag{70}$$

Finally, since $\sum_{k=1}^{K} H(\zeta_k) \ge H(\zeta_1, \ldots, \zeta_K)$, we conclude that the sum rate in (29) is achievable.






Appendix C 
Proof of Theorem 5

We consider the FF-BC defined by $y = Qx \oplus \zeta$, where $x = (x_1, \ldots, x_L)^T$ and $y = (y_1, \ldots, y_L)^T$. Since $Q$ is invertible, letting $x = Q^{-1}v$ for $v \in \mathbb{F}_{p^2}^L$ yields the orthogonal BC $y = v \oplus \zeta$. The achievable sum rate for this decoupled channel is obviously given by the sum of the capacities of each individual additive-noise finite-field channel, irrespective of the statistical dependence across the noise components. Each $\ell$-th channel capacity is achieved by letting $v$ be i.i.d. with uniformly distributed components over $\mathbb{F}_{p^2}$. It follows that the sum rate (31) is achievable. In order to show that this is in fact the sum capacity of the FF-BC, we notice that a trivial upper bound on the broadcast capacity region is given by [41]

$$R_\ell \le \max_{P_x} I(x; y_\ell) \quad \text{for } \ell = 1, \ldots, L. \tag{71}$$

This is the capacity of the single-user channel with transition probability $P_{y_\ell|x}$. Due to the additive-noise nature of the channel, we have $I(x; y_\ell) = H(y_\ell) - H(\zeta_\ell)$. Furthermore, $H(y_\ell) \le 2\log p$, and this upper bound is achieved by letting $x$ be uniform over $\mathbb{F}_{p^2}^L$. Summing over $\ell$, we find that the upper bound on the sum capacity coincides with (31).
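The per-user bound used in this proof, $2\log p - H(\zeta_\ell)$, is straightforward to evaluate once the noise pmfs are known. The following minimal sketch assumes logarithms in base 2 (rates in bits) and represents each noise pmf as a list of $p^2$ probabilities over $\mathbb{F}_{p^2}$; the function names are hypothetical.

```python
import math

def entropy_bits(pmf):
    """Shannon entropy in bits of a pmf given as a list of probabilities."""
    return -sum(q * math.log2(q) for q in pmf if q > 0)

def ff_bc_sum_capacity(p, noise_pmfs):
    """Sum over users of 2*log2(p) - H(zeta_l).

    ASSUMPTION: each entry of noise_pmfs is the marginal pmf of zeta_l
    over the p^2 elements of F_{p^2}.
    """
    return sum(2 * math.log2(p) - entropy_bits(pmf) for pmf in noise_pmfs)

if __name__ == "__main__":
    p = 3
    noiseless = [1.0] + [0.0] * (p * p - 1)   # zeta_l = 0 with probability 1
    uniform = [1.0 / (p * p)] * (p * p)       # zeta_l uniform over F_{p^2}
    print(ff_bc_sum_capacity(p, [noiseless, uniform]))
```

A noiseless user contributes the full $2\log_2 p$ bits, while a user with uniform noise contributes nothing, which matches the two extremes of the bound $H(y_\ell) \le 2\log p$.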